That is simply down to model quality. You still have DALL-E 3 via a GPT, but it is even less likely to follow a prompt consistently, despite its colorful style within a limited range.
Here are some tips I’ve worked out myself:
- A new session for each new creation is always best. The AI behind image creation gets hung up on its vision of prior generations, and unless you have direct changes to make to an existing composition, you just get strange results: like a computer operator now operating nothing in the changed perspective.
- Go into settings → customize ChatGPT → custom instructions, and turn off all the tools, including the now function-less “DALLE”. They only serve as a distraction to the AI.
- Self-reflection by the AI you talk to works best. It can produce a much larger, higher-quality idea than an immediate call to an image creation model.
- Follow the prompt technique in the code block of this post: Gpt4o api is producing bad images compared chatgpt gpt4o - #2 by _j
I went over to another company’s AI and used a similar prompt to turn the DALL-E image shown earlier into pure language, then brought that language back to ChatGPT’s new image maker: many paragraphs instead of typical user input.
If you hover over the image, you can see the result is quite literal. You can do that language amplification entirely within a ChatGPT session. Given the way a multimodal AI works, you’re not going to get random creative spaces out of describing nothing.
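The same two-step workflow can also be driven through the API. Here is a minimal sketch using the OpenAI Python SDK: first a chat model amplifies a short idea into a multi-paragraph scene description, then that description is handed to the image endpoint. The helper prompt wording, model names, and subject here are my own assumptions, not the exact prompts from the linked post.

```python
def build_amplification_prompt(subject: str) -> str:
    """Ask a chat model to expand a short idea into a full scene
    description: many paragraphs instead of typical user input."""
    return (
        "Describe, in several detailed paragraphs, an image of "
        f"{subject}. Cover the composition, the subjects, the "
        "lighting, the palette, and the style, written as literal "
        "instructions for an image generation model. "
        "Output only the description."
    )

def amplify_then_generate(subject: str) -> str:
    """Language amplification: short idea -> long description -> image.
    Requires OPENAI_API_KEY in the environment."""
    from openai import OpenAI  # third-party SDK; assumed installed

    client = OpenAI()
    # Step 1: turn the short idea into pure language.
    chat = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user",
             "content": build_amplification_prompt(subject)},
        ],
    )
    description = chat.choices[0].message.content
    # Step 2: bring that language back to the image model.
    image = client.images.generate(
        model="dall-e-3",
        prompt=description,
        n=1,
    )
    return image.data[0].url

if __name__ == "__main__":
    print(amplify_then_generate(
        "a computer operator at a vintage terminal"))
```

Keeping the amplification and generation in separate calls is the point: the chat model does the creative expansion in language, so the image model only has to follow literal instructions.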