I just came for the popcorn.
I have another idea that just crossed my mind about what could also create the pattern. However, I still know too little about the process of the new 2.0; nothing is publicly known about it.
It is speculated that it is a hybrid process made up of "Autoregressive Image Generation" and a denoiser that has been known for a little longer. If someone takes a lot of images with the same format, overlays them all, and the pattern is always the same, meaning the pattern structures always align, it could be due to how Autoregressive Image Generation creates the images.
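The overlay test described above is easy to sketch: averaging many same-size outputs blurs out the varying content, while a position-locked artifact pattern survives in the residual. A minimal sketch, assuming the images are already loaded as equally shaped grayscale arrays (the demo data below is synthetic, not real generator output):

```python
# Overlay test: average many same-size images. Real content varies from
# image to image and averages away, while a fixed, position-locked
# artifact pattern survives and becomes visible in the residual.
import numpy as np

def overlay_residual(images):
    """images: list of equally shaped 2-D grayscale float arrays."""
    mean = np.mean(np.stack(images), axis=0)
    # Remove the global offset so only the recurring structure remains.
    return mean - mean.mean()

# Demo with synthetic data: random "content" plus a fixed stripe pattern.
rng = np.random.default_rng(0)
stripes = np.tile([0.0, 0.0, 0.0, 1.0], 16)  # period-4 pattern, length 64
fixed = np.outer(np.ones(64), stripes)       # 64x64 fixed artifact
imgs = [rng.normal(size=(64, 64)) + fixed for _ in range(200)]
res = overlay_residual(imgs)
# The stripe columns stand out clearly above the averaged-out noise.
print(res[:, 3].mean() - res[:, 0].mean())
```

With real files one would load each image with PIL, convert to grayscale, and feed the arrays into the same function; if nothing recurring is present, the residual is just flat noise.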
Combining Autoregressive Image Generation with diffusion could make sense: AG for precision, diffusion for creativity. It could also be the reason why fantasy images produce more of the pattern than realistic images. So one question is: does more creativity in the process also lead to more pattern formation, and not only the details alone? Creativity triggers it, the details reinforce it. Do the two methods simply not cooperate well in the process?
I know the data for the OAI generators is not publicly known. But if someone knows details… If the pattern is not inserted intentionally, but arises from the current process without manipulation, it could be a side effect of two systems not working well together.
It would also explain why this pattern can have a toxic effect on the weights during training (toxic means nothing other than: instead of improving the weights, it makes them worse). If the same patterns keep appearing in very many images, a system trained on them will conclude that images have to look like that. Then more and more such images with these patterns will be generated, also by other methods or by pure diffusion models.
(Diffusion systems have similar problems: if photos with a lot of noise or typical photography flaws are loaded, these errors reappear in the generated images. This 2.0 pattern can also reappear if the weights are trained on it. This is especially important for technicians! They have to recognize these images and keep them out of the training set, just like bad photo data.)
It could even be that what we see is already such a training effect. (Developers will know what I mean.)
So… one conclusion would be: if "Autoregressive Image Generation" creates a certain patch pattern of a certain size, can it, in cooperation with a diffuser (or whatever else 2.0 does during generation), produce these patterns? Does the pattern line up with the patch sizes?
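Whether an artifact lines up with a patch grid can be probed without knowing the architecture: the autocorrelation of an image's row or column profile peaks at the repeat period if the pattern is grid-locked. A rough sketch on synthetic data; the patch size of 16 is an assumed example, not a known property of the model:

```python
# Check whether a repeating artifact aligns with a hypothetical patch
# grid by autocorrelating a 1-D profile of the image: a grid-locked
# pattern produces an autocorrelation peak at the patch period.
import numpy as np

def dominant_period(signal):
    """Return the lag (> 0) with the strongest autocorrelation."""
    s = signal - signal.mean()
    ac = np.correlate(s, s, mode="full")[len(s) - 1:]  # lags 0..n-1
    ac[0] = 0.0  # ignore the trivial zero-lag peak
    return int(np.argmax(ac[: len(s) // 2]))

# Demo: synthesize an image with a period-16 grid artifact (assumed size).
rng = np.random.default_rng(1)
img = rng.normal(scale=0.2, size=(128, 128))
img[:, ::16] += 1.0  # vertical seams every 16 px, like patch boundaries
col_profile = img.mean(axis=0)  # collapse rows to a 1-D column profile
print(dominant_period(col_profile))  # 16 for this synthetic example
```

Run on real outputs, a consistent peak at one fixed lag across many images would hint at a patch-aligned cause; no stable peak would point elsewhere.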
Can it be that we see two different methods in action in the picture? One for the flat surfaces, and one which gets triggered by complexity?
(Or is it caused by bad training data with such artifacts in it?)
It is all pure speculation.
If a technician reads this, maybe they have a few hints for us. Without information about the architecture of the generator, we can only try things out and observe. (That is how we did it in the early days too, and it took us a looong time to figure things out.)
It seems to me that you are being a bit too speculative with all of this.
IMO, the most efficient service we can provide on this thread for people having problems with their images is to look at their prompt and provide the best possible prompt revision. Anything beyond that could be treated as a possible bug and investigated with what is otherwise known as the Scientific Method:
The scientific method is a systematic, iterative process for investigating observations, answering questions, and testing hypotheses through experimentation. It ensures objectivity by requiring reproducible experiments that distinguish between correlation and cause-and-effect. The core steps include observation, questioning, hypothesis formulation, testing, analysis, and communication.
Anything else may be a waste of time.
Simplicity Is Elegance
Prompt
A surrealistic image of a brightly illuminated underwater alien city. The brightness level must be such that all figures and objects can be clearly seen.
That's right - zoom in for the details. Both images were created from the same simple prompt.
Sometimes, a concept image prompt can be obfuscated with complex verbiage. Just like an experienced software engineer can accomplish a task with 10 lines of code where an inexperienced software engineer will need 100 lines for the same task.
What's interesting here is:
It has become more aesthetically pleasing again.
Hmm, looking at your prompt, the context has shifted: from a natural, dynamic depiction to a rather static painting.
Take a look at:
- the shape and colouring of the clouds show these "patterns"
- the cloud touched by the wing isn't smoke; it is reminiscent of fabric
- the bird is neither a kite nor a hawk; it is something that merely has the shape of a bird of prey
- the rainbow looks as though it has simply been placed into the picture. This is because the gradient is slightly different here too. A rainbow is a dynamic effect that depends on where the observer is standing. Here, regardless of where I position myself, whether as the bird or as the viewer, the orientation of the rainbow is not realistic.
It seems that highly aesthetic images that can be fixed using prompts, as in the @Chain_L examples, show few of these patterns simply because all the details are determined by the prompts.
@Daller captures the aspect I was trying to illustrate here:
Generating clouds that have a flow because the kite's wing is stirring them up.
Prompt
"Black-and-white or low-color realistic sketch, no cinematic composition.
A red kite (Milvus milvus) flying through a dense cloud layer, with a clearly visible deeply forked tail.
The wings must create visible aerodynamic effects: turbulent airflow, vortex trails, and irregular cloud displacement behind the wings.
Cloud density must vary locally, showing disturbed regions, gaps, and swirling patterns caused by motion.
A partially visible rainbow appears only in regions where light, water droplets, and viewing angle align correctly, not as a perfect arc.
No symmetry, no idealization.
Details must remain physically plausible when zoomed in: feather structure, cloud particles, and light interaction should show irregular, non-repeating patterns.
The scene should feel like an imperfect, real physical process rather than a composed image."
My workaround here is not to make everything too artificial and simplistic, and not to include natural descriptions.
Instead, I try to demand strict scientific definitions:
The kite (Milvus milvus) is precisely defined:
its shape, colour and the fork in its tail feathers.
However, I had to use the scientific name to compensate for the vagueness. And even then, it is still too blurry and there are a lot of "noise patterns". At least you can make out that it is supposed to be a kite.
When I look at all your arguments, there are two perspectives:
- (traditional) prompting is essential
- (traditional) prompting is not crucial if the model architectures lead to noisy outputs
I guess, both approaches can be implemented by the models.
However, for customers in the scientific area, I see challenges:
They simply do not have the time to deal with overly rigid prompting. They want a suitable representation using their own vocabulary.
Not "yes, it sort of looks like a kite"; no, they want a kite by definition.
Or a mechanical engineer: they want a flow of air, whether laminar or turbulent; a "well, it sort of looks like that" won't help them at all.
I agree with you on that, but thereâs just one small point Iâd like to make.
Natural scientists and engineers arenât software developers - they work with parameters drawn from nature.
Not aesthetics, not rigid images, not simplification when it comes to distorting reality.
Because they rely on everything fitting together!
AI is also designed with this customer base in mind.
With all due respect Tina, I really think you are over-thinking all of the above. "Beauty is always in the eye of the beholder."
I just want to have fun creating images - life is too short.
No problem, it's just that I'm an engineer and I know what's causing my colleagues headaches.
This thread is about bugs, so we can provide people with the necessary workarounds.
And ideally, help OpenAI understand their customersâ concerns better.
Me too - been a software engineer for 36 years.
help OpenAI understand their customersâ concerns better.
Well maybe some day they will release a new version that will meet your requirements.
I don't understand what some of these posts are getting at. The issue is pretty easy to recreate and doesn't go away with specific kinds of prompts. If I want a fantasy painting of a landscape or village or whatever, it will look good for the first few results in a chat. But after that, different requests get increasingly messy unless you direct it to photorealistic generations (and even then it only works on more mundane objects and nothing too fantastical). Using references also causes weird artifacting in many gens, like we see. BoyuanChen0 on twitter says they're working on it and will announce when they have it fixed. That suggests to me that this is a bug and unintended behavior.
With DALL-E 3 I could keep generating in the same chat with consistent quality. I don't like that I can't do that with image 2; that's what I want fixed.
I really don't know what to say to you. What do you want to accomplish on this thread? Are you here only to complain about gpt-image-2 like others here?
No. I only want awareness of the bug to remain, and to show that some people still want it fixed if possible. I'm patiently waiting for the devs to fix it. If you want to antagonize and argue on the internet, you do you, but I would have thought you had more important things to do. Ta ta.
What bug are you referring to? DALL-E 3-level consistency for gpt-image-2?
The issue is pretty easy to recreate and doesn't go away with specific kinds of prompts.
You are part of a chorus of people on this thread who refuse to show prompts that result in subpar images. It could be that you need help with a prompt revision.
Tina, I just now saw your thread: Prompting vs Structure - A Boundary Test
I now fully understand what you were talking about. Sorry for my confusion!



