I am asking Dall-E API, on a daily basis, to generate images through the prompts bellow (two versions tested). The only changing data in those prompts are the meteorological data list, depending on the meteo each day. It works great, except that form time to time, Dall-E adds writings on the picture, which is clearly forbidden by the prompts. The problem is that, when the image is not correct… I still have to pay for it, even if I have to regenerate it. Does any one know what I have to add/remove/adapt in my (last) prompt to avoid such writtings on the generated images ?
Original prompt:
Create a photorealistic landscape featuring an old farmhouse seen from far away in a peaceful field.
Use the following weather information to depict the scene:
latitude: 43.6669998169;
longitude: 4.0170001984;
temperature: 0°C;
humidity: 100%;
visibility: 10000 permille;
wind speed: 5m/s;
clouds cover: 0%;
rainfall: 0 mm/h;
snowfall: 0 mm/h;
hour of the day (24h format): 8;
month of the year: 1.
Use the temperature and month to determine the appropriate season and vegetation state. Adjust the lighting based on the time of day and cloud cover, with the source of light always at the right side of the picture. Represent wind speed through the movement of grass, trees, or clouds. Show humidity and visibility as indicated through appropriate atmospheric effects. Include any precipitation (rain or snow) if present. The overall scene should convey a sense of tranquility, except in any harsh weather conditions.Any text on the image, such as weather information, is forbidden.
Last version prompt (actually in use):
Generate a photorealistic image of a landscape showing an old farmhouse viewed from afar in a serene, open field.
It is CRITICAL that the image contains absolutely NO TEXT, NUMBERS, OR WRITTEN MARKINGS of any kind, whether intentional or accidental.
If any text is generated, the image must be discarded and regenerated.
Use the following weather data to depict the scene visually (no labels or annotations):
• latitude: 43.6669998169
• longitude: 4.0170001984
• temperature: 0°C
• humidity: 100%
• visibility: 10000 permille
• wind speed: 5m/s
• clouds cover: 0%
• rainfall: 0 mm/h
• snowfall: 0 mm/h
• hour of the day (24h format): 8
• month of the year: 1
Use the temperature and month to determine the season and vegetation state.
Incorporate soft atmospheric effects to reflect the high humidity and wind.
The scene should convey tranquility, except in any harsh weather conditions.
Remember: No text, numbers, or labels of any kind may appear in the image under any circumstances.
I recommend not to use negatives in a prompt with AI as the negative itself makes the outcome of it being generated more likely.
Example:
“Do not generate a dog!”
→ The AI sees the word “dog” which it has been trained on generating pictures of dogs.
It also sees the words “Do no generate a” which often appear in other images as well, making the final output highly likely to be of a dog.
The solution?
Most open source image generation models support a negative prompt, meaning what you don’t want in the image.
If you write “Dog” in the negative prompt, there will pretty much never be a dog generated in the image.
I’m not sure if DALLE supports negative prompts and if not - you can always look up open source image generation models.
Wow, that’s a news ! Do you think that an AI “without negative capabilities” mays react differently between “don’t draw a picture with a god” and “drow a picture without dogs” (negative action vs. negative item) ? Because in a way or another, I -HAVE- to be sure that no writings will exist in the output image
Without negative prompts your best bet is to avoid mentioning it in the first place.
You don’t want numbers generated? Don’t use the word numbers or numbers themselves (0123456789) in the first place.
Don’t want a dog to get generated? Never use the word “Dog” in the prompt in the first place.
With your use case of course this could be hard.
A possible solution is to first send a request to a chat endpoint with the weather data and ask it to return the weather in words.
Output would be something like “Slight rain with clouds” or similar.
You could THEN pass this description to DALLE, making the chance of it generating numbers in the image practically 0.
Wow, that is a cleaver idea! I’ll test it by first asking ChatGPT to transform the weather data list into a readable text, so the very base prompt (without any “textual” reference) can be used again. Thank you VERY much !
THAT WORKS ! Your idea to first transform the numerical data into text did the trick ! And even better : the resulting images are even more accurate now, especially regarding the latitude and longitude ! Now that ChatGPT first specifies where the scene takes placewith words, dall-E even succeeds to respect the place architecture
Remember also: DALL-E-3 on the API doesn’t have you sending your text directly to the image model. There is a language AI in front that is prompted to rewrite the text, along with a whole bunch of guidelines it must follow.
You can see what is being sent to the image model by extracting the additional API parameter response.data[0].revised_prompt
The endpoint’s intercepting AI doesn’t really work in a “talk to the AI” way, but if you are deliberate enough, and simulate or break out of the container containing an input prompt message, you can get it to follow instructions.
“Don’t include a dog”, though: the language AI doesn’t have enough sense to do anything more than pass that along.
You would be best not sending nonsensical things needing AI thinking, like “weather at this latitude” or “humidity”… Prompt exactly what is actionable portrayal of objects, background, style, format. Or have a better language AI that is instructed to break down a user input and transform it into things that can actually be depicted.