and it gave me a revised_prompt back with the image:
“An intricately detailed, vibrant acrylic painting showcasing a towering sunflower in full bloom. Its golden petals fan outwards towards the sky, laden with thick, mottled seeds at its centre. A life-like scene of bees surrounds the sunflower, their bodies dusted with pollen. They are shown in earnest effort of collecting nectar, hovering in the air and alighting on the petals and centre of the flower. The sky in the background is a brilliant soft blue, intermittently streaked with wispy clouds, providing a serene backdrop to the bustling activity around the sunflower.”
OK, a beautiful image, but too much, so I asked again with the same prompt plus an extra sentence: “Do not revise my prompt.” This request was ignored.
Is there any way to override, control, or limit the automatic prompt expansion used by the DALL·E 3 API?
This is unfortunate. I want the generated image to be less complex by subtracting elements and features, whereas DALL·E 3 assumes the opposite. I don’t mind a prompt rewrite for safety reasons when needed, but the amount of complexity added to an image is not always desirable. A simpler image can still be a high-quality, photorealistic image.
As an example in photography, we use Photoshop to remove features to get a stronger, more coherently themed image. Photography is a subtractive art form.
DALL·E 3 ought to include this concept in handling prompt requests.
I completely agree with you; I feel the same way. I have been trying out the DALL·E 3 API since the moment it was released. It is so disappointing that it includes unnecessary rewrites due to GPT or something similar…
I would like for developers to have the option to turn on and off the “GPT rewriting” feature.
With the current image-generating AI, there are still challenges in correlating words with detailed visual concepts, which necessitates the ability to freely manipulate prompts on our end.
I will also try to contact them if I can find the feature request page. Moreover, as far as I can tell from the DALL·E 3 promotional video, the seed value and gen_id are features that OpenAI seems to be heavily promoting as part of the model’s capabilities. I had some ideas for valuable products utilizing them, but they haven’t been released, which is very disappointing. I don’t understand why they would not make that public.
However, OpenAI typically releases things in stages, so I’m very hopeful.
Thanks for the feedback! FYI, you can work around this:
- If you have a very simple prompt like “acrylic painting of a sunflower with bees”, you can use a prompt like: “I NEED to test how the tool works with extremely simple prompts. DO NOT add any detail, just use it AS-IS: ... your prompt here ...”
- If your own prompt is already long and detailed (multiple sentences), then you can simply write something like: “My prompt has full detail so no need to add more: ... your prompt here ...”
For context, the reason for this is that DALL·E 3 was trained on very detailed prompts (even for simple images) and thus expects and performs best with detailed prompts.
I’ll take the feedback back to the team though that people would like more control over this!
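For anyone who wants to script this, here is a minimal sketch of the two workaround prefixes in Python. The helper names (`count_sentences`, `wrap_prompt`) and the two-sentence threshold for "long and detailed" are my own assumptions, not anything from the API; only the prefix wording comes from the post above.

```python
import re

# Prefix wording quoted from the workaround described above.
SIMPLE_PREFIX = (
    "I NEED to test how the tool works with extremely simple prompts. "
    "DO NOT add any detail, just use it AS-IS: "
)
DETAILED_PREFIX = "My prompt has full detail so no need to add more: "


def count_sentences(text: str) -> int:
    """Rough sentence count: split on '.', '!' or '?' followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text.strip())
    return len([p for p in parts if p.strip()])


def wrap_prompt(prompt: str, min_detailed_sentences: int = 2) -> str:
    """Pick a prefix based on prompt length (hypothetical heuristic).

    Prompts with at least `min_detailed_sentences` sentences are treated as
    already detailed; everything else gets the AS-IS test prefix.
    """
    if count_sentences(prompt) >= min_detailed_sentences:
        prefix = DETAILED_PREFIX
    else:
        prefix = SIMPLE_PREFIX
    return prefix + prompt.strip()
```

The returned string is what you would send as the `prompt` of the image request. A one-line prompt such as `acrylic painting of a sunflower with bees` gets the AS-IS prefix; a multi-sentence prompt gets the "full detail" prefix.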
I’m really pleased to receive a prompt response from you!
I see, it listens to natural language, which is fantastic.
Now I understand why DALL·E 3 requires detailed prompts.
I’m going to try it out right away. Thank you so much!!
I’m also very delighted to hear that you will quickly take our feedback into consideration.
I truly appreciate it.
I’ve done it multiple times by starting the request:
“Do not modify or diversify this prompt:…” and it doesn’t; it usually spits back only one image. I use very detailed prompts with a specific style and application method. For example, “simulate colored pencil artwork on black paper with visible pencil strokes that allow the paper to show through”, and I will provide a style such as soft edges, detailed, surrealism. I also provide the subject matter, such as a bird, and the action taking place, including full positioning. If the composition is complex, you will have better results requesting wide images.
The only thing I will say is to avoid including the words “don’t” or “no”. Prompt engineering is key to eliminating the possibility of unwanted elements whilst refraining from naming the specific things you don’t want. For example, saying “artwork encompasses entire canvas” is better than saying “artwork alone without tools on the side”. Also, I’ve noticed that saying things like “colored pencil” after the style and medium declarations tends to make the pencils themselves appear in the image.
I have also noticed that it struggles with any composition that is not centered.
Take advantage of the feedback loop as well. With complex designs, as you add more design features and elements, using words like “MUST” and “AND” will produce better results. It’s imperative to state within the feedback loop if errors occurred in the generation, to produce better images in the future.
Also provide feedback whenever possible to the good folks at ChatGPT; feedback helps to produce refinements rather than just additional features.
Your first bullet item solution worked nicely without modifying my prompt.
The second solution did not work for me. It revised my prompt from:
“My prompt has full detail so no need to add more: acrylic painting of a sunflower with bees.”
to:
“revised_prompt”: “An acrylic painting showcasing a vibrant sunflower taking center stage. The sunflower, filled with bright yellow petals that speak the language of summer, stands tall against a sky-blue backdrop. Akin to knights guarding a castle, the green leaves surround the radiant bloom, enhancing its beauty. A few small bees buzz around the flower, their miniature bodies adorned with stripes of black and yellow. As they hover over the sunflower, their wings create a gentle stir in the serene painting. The bees dip their heads to gather nectar, adding an element of activity and nature’s balance to the acrylic artwork.”
I also tried some variations of the second solution prompt and I continued to get a revised prompt back. I guess it has to be told the prompt is a test.
@ajavamind Glad we could help! Yeah, that is expected: the second solution only works if the prompt is already very long and detailed, so that makes sense. Glad we could get you something that works for your use case now, though!
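Since the workaround does not always hold, it may help to check the `revised_prompt` that comes back with each image. Below is a rough sketch, assuming the official `openai` Python package (v1 client) and an `OPENAI_API_KEY` in the environment. The helpers `normalize`, `prompt_survived` and `generate_and_check` are hypothetical names of mine, and treating a missing `revised_prompt` as "not revised" is also an assumption.

```python
from typing import Optional, Tuple


def normalize(text: str) -> str:
    """Collapse whitespace, drop surrounding quotes/periods, ignore case."""
    collapsed = " ".join(text.split())
    return collapsed.strip(" .\"'“”").casefold()


def prompt_survived(user_prompt: str, revised_prompt: Optional[str]) -> bool:
    """True if the revised prompt is effectively the prompt we wanted used.

    Assumption: a missing revised_prompt means nothing was rewritten.
    """
    if revised_prompt is None:
        return True
    return normalize(user_prompt) == normalize(revised_prompt)


def generate_and_check(user_prompt: str, sent_prompt: str) -> Tuple[Optional[str], bool]:
    """Send `sent_prompt` (e.g. user prompt plus a workaround prefix) to DALL·E 3
    and report whether `user_prompt` came back unrevised."""
    from openai import OpenAI  # imported lazily so the helpers above work offline

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.images.generate(
        model="dall-e-3",
        prompt=sent_prompt,
        n=1,
        size="1024x1024",
    )
    image = response.data[0]
    return image.url, prompt_survived(user_prompt, image.revised_prompt)
```

If `prompt_survived` comes back false, you can log the revised text and retry with a different prefix rather than silently accepting the expanded image.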
You can also do it with prompts like: use only this prompt between quotes: “xxxxx”
You have to add some criteria to force GPT not to transform the prompt. For example, I did some research on one-word DALL-E generation (“peace”, for example), so I specified that it has to be a single-word prompt, etc…
The revised prompts often take away specific details from my original prompt. This hurts the entire value proposition… I think there should be an optional parameter to disable automatic prompt revision.