Collection of GPT-image-generator 2.0 issues, bugs, and work-around tips (check first post)

Here is a collection of tips and tricks, and some weaknesses and limits too, focused on the technical side of the GPT image generator. Discovering even relatively simple things takes quite some time, so this might help save some of it when experimenting.
Ugly, broken, incorrect or boring test images are also welcome here. This is less about creativity, aesthetics and good ideas, and more about probing the models for weaknesses on a technical level.

The first post collects all the findings and will be updated from time to time. When a bug is fixed, its entry on the first page will be removed.
For prompting tips, and cool images, check the galleries: Topics tagged gallery

Important:

  • It is not a typical gallery. It analyses mainly technical problems, and tries to find workarounds if possible.
  • There are no API or Python tips here, only prompting for the image generator system itself.
  • We do not analyze jailbreaks! Such posts will be deleted!
  • We all do this here as a hobby, so be kind to each other. :smiling_face_with_three_hearts:
Here you can find examples of how this was done in the past, for older models:

Collection of GPT-4o-images prompting tips, issues and bugs

Collection of Dall-E 3 prompting tips, issues and bugs

Some of the prompting tips for DallE 3 may still be relevant.

References and links

GPT Image Generation Models Prompting Guide

ChatGPT Images 2.0 System Card - OpenAI Deployment Safety Hub

https://openai.com/de-DE/index/introducing-chatgpt-images-2-0/

In case of access issues or other pure technical issues, check here if the problem is known: https://status.openai.com/


Bugs:

  • Visible Noise: The amplifier effect was fixed by making subsequent images no longer depend on the previous ones. But there is still visible noise in the images. It is disturbing and causes a kind of neurological stress, like MP3 artifacts in sound, and it is still present.
    (… and it might possibly have a destructive effect on model weights if such images are fed back into training.)

Tips:

  • Aspect Ratio: It is recommended to specify an exact pixel size or aspect ratio, such as 1536x1024 or 3:2; landscape images can now use different ratios.

Fixed Bugs:

  • FIXED, Noise Amplification: The generator kept some data from previously generated images and reused it for the next ones. This was the cause of a very bad bug: it amplified noise patterns very quickly, and after just 3-5 pictures the images were destroyed.
    The only workaround was to restart the session by reloading the web page.

  • FIXED, No unique pictures in the same session: A related problem was that it was not possible to generate multiple different images with different seeds from the same prompt. Each time, the session had to be renewed for a new picture.

9 Likes

The generator keeps some data from the generated images and reuses it for the next ones. This is the cause of a very bad bug: it amplifies noise patterns very quickly, and after just 3-5 pictures the images are destroyed.
At the moment, the only workaround is to restart the session by reloading the web page.
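The amplification described above can be illustrated with a toy simulation (purely hypothetical numbers and code; the real generator's internals are not public): if each new image mixes in data from the previous one, the carried-over noise pattern grows with every generation instead of averaging out.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate(prev=None, carry=1.2, size=4096):
    """Toy 'generation': a little fresh noise on top of a clean signal.
    If `prev` is reused, its noise pattern is carried over and grows."""
    fresh_noise = rng.normal(0.0, 0.01, size)
    if prev is None:
        return fresh_noise              # first image: only fresh noise
    return fresh_noise + carry * prev   # reuse amplifies the old pattern

img = None
levels = []
for _ in range(5):
    img = generate(img)
    levels.append(round(float(np.std(img)), 4))

print(levels)  # the noise level rises with every generation
```

After a handful of iterations the carried-over pattern dominates, which matches the observed "destroyed after 3-5 pictures" behaviour; without the `carry * prev` term, every image would start from independent noise and the level would stay flat.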

It is not entirely clear where the noise comes from. It could be a watermark mixed in to recognize AI-generated images; that was an idea from @_j and makes sense to me. It could be part of a deepfake detection method, which unfortunately becomes more important the better the generators get.

But it could also be a flawed method of adding details, or a problem in the math of the generator itself. I have experimented with such methods myself: you can make an image more detailed by specifically modifying the latent space during generation, or by giving the image a starting direction by influencing the initial noise, so the start is no longer completely random.
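The "starting direction" idea can be sketched with a tiny example (hypothetical toy code, not the generator's actual API): a diffusion-style generator starts from random noise, and fixing that initial noise with a seed reproduces the same starting point, while an unseeded start differs every time.

```python
import numpy as np

def initial_noise(shape=(8, 8), seed=None):
    """Starting noise for a diffusion-style generation.
    seed=None -> fresh random start; fixed seed -> reproducible start."""
    rng = np.random.default_rng(seed)
    return rng.normal(size=shape)

a = initial_noise()          # two unseeded starts...
b = initial_noise()          # ...differ, giving unique pictures
c = initial_noise(seed=42)   # two identically seeded starts...
d = initial_noise(seed=42)   # ...match, giving identical outcomes

print(np.allclose(c, d), np.allclose(a, b))  # True False
```

Biasing the start rather than fully fixing it (for example, mixing a small structured component into the noise) gives the image a direction without making the results identical.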

Since the new generator is a hybrid method whose internals, as far as I know, are not publicly explained, it is difficult to say where the pattern comes from. But it is clearly amplified by the generator reusing image data. The pattern is already visible to me in the very first image, in many images, and it is disturbing, especially on structured surfaces or clouds. I do not know exactly what my brain is doing, but it feels uncomfortable, just like the flickering confetti noise that DallE 3 generated. It is similar to the "blubber" distortion artifacts in heavily compressed MP3 files; newer audio compression methods are better because their information loss has less acoustic structure. Visually, the brain is stressed in the same way as by acoustic distortions, even before the pattern becomes destructive.

No unique pictures anymore in the same session: A related problem is that it is not possible to generate multiple different images with different seeds from the same prompt. Each time, the session must be renewed for a new picture.

It would be an easy fix for OpenAI: just do not reuse image data for the next images, and two problems are fixed at once: no noise amplification and no identical images in the same session. Re-editing does not work that way anyway; the edit page should be used for that, including re-prompting or prompt-based corrections.
(For testing, I usually make 10 pictures from the same prompt. This means I now have to reload the page for every single generation. Very tedious. I think many customers make more than one image with the same prompt if they like the outcome.)

These are 5 images in a row, with the same prompt:
Each generation reinforces the pattern.
And all the images are the same.


Here every prompt was different; this is the 5th generation:
Same effect, it keeps data from the previous pictures.
The destruction is even worse, because the image data differs at each step.

4 Likes

The quality is definitely better than what I know from DallE 3.
But the bug ruins it.

Here is DallE 3 with the same prompt:

1 Like

For the people remembering the DallE 3 times.

3 Likes

This (diagonal grid artifacts) can happen with the first image in a new chat if you tell it “in the style of” and it runs a background search to find examples to use the style from.

GPT-Image-1, near release

GPT-Image-2

You’d be correct that everything that is identical is prompted in storybook fashion, down to the color of the rug that is now visible when a portrait aspect ratio seems automatically preferred. “The image has a cinematic, photorealistic quality, blending real-life texture with digital fantasy” gives latitude for either appearing to be a convincing photograph or hyper-surrealism.

The images are a bit soft, but that's better than a final pass that, after a promising photographic preview, turned out distorted and twisted in the previous model. Requesting high resolution on the API and then downsizing might mitigate that a bit.
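Why downsizing helps can be shown with a small numpy sketch (an illustration of the general principle, not of the actual API): averaging pixels in 2x2 blocks halves the standard deviation of uncorrelated per-pixel noise, so high-frequency artifacts shrink relative to the image content.

```python
import numpy as np

def box_downscale(img, f=2):
    """Downscale by averaging f x f pixel blocks (a simple box filter)."""
    h, w = img.shape
    h, w = h - h % f, w - w % f  # crop to a multiple of f
    return img[:h, :w].reshape(h // f, f, w // f, f).mean(axis=(1, 3))

rng = np.random.default_rng(1)
noisy = rng.normal(0.0, 1.0, (256, 256))  # pure per-pixel noise
small = box_downscale(noisy)

# Averaging 4 independent pixels divides the noise std by about 2:
print(round(float(noisy.std()), 2), round(float(small.std()), 2))
```

In practice, Lanczos or bicubic resampling (e.g. Pillow's `Image.resize`) does the same job with fewer aliasing artifacts than a plain box filter.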

3 Likes

Yes, the generator is clearly better than 1.0 (probably also 1.5, which I missed). I found the old one especially very boring and uncreative for the kind of motifs I make. The new one is clearly more creative, I still need to see if I can mix forms like with DallE 3.

And yes, the resolution is no longer as fixed. In the first test above, the generator varied the landscape format during generation, that wasn’t in the prompt.

The distortions in the last pass with 1.0 look like a typical refiner effect. I’ve seen this technique in offline models as well. But there are better methods to add details if you modify the latent space directly during generation. 1.0 was boring.

But I can’t really test yet because I constantly have to restart the session.
I don’t have API access, so I can only speculate: do these error patterns not appear in the API?
I hope they fix this quickly, then I can start testing. I have many prompts I want to try and compare.

1 Like

Hello, welcome to the forum.

Do you mean the same distortions as in the first images, or a specific style?
You can post examples if you have found errors or technical weaknesses.
Ideally with information on how to reproduce them. If they depend on the prompt, feel free to include the prompt that triggers them.

Yes, the same kinds of distortions as shown on the first image. I notice it mainly when I am requesting painting styles. Here I asked it to do the painting of Joan of Arc in the style of Alphonse Mucha’s Slav Epic. I could see in the thinking notes that it went out and found some reference images of the Slav Epic. If you look at her outstretched arm, you can see some of those light and dark spots that match the artifacts from the first image in the thread.

2 Likes

In contrast, if you do not use any reference images or previous images in the chat, it can do very high quality without any artifacts.

1 Like

You still have to start a new session? What I noticed yesterday when I prompted images in ChatGPT was that the quality was still terrible, which leads me to think that gpt-image-2 wasn’t under the hood for me yet. Today it’s clearly the new generator: it was thinking about what it was going to generate, and the output is so much better quality. So hopefully we don’t need to start a new session every 5 images​:crossed_fingers:And I really like the outputs from gpt-image-2; I think it will be fun to work with​:smirking_face:

1 Like

Yes, I noticed the patterns already in the very first generation, and it doesn’t even matter which style it is. Photorealistic images may not show it as clearly in the first generation as a painting does. I think some motifs make the pattern more noticeable, but it is there from the beginning. And it is definitely a flaw: you can’t get rid of it with prompts, not even by changing the style. Unfortunately.

I think the pattern is always there, you can reinforce it by repeating the image with the same prompt in the same session.

Definitely, some of my tricks no longer work. 2.0 is linguistically more precise.
This is a “Bluepod”, for DALL·E 3 and for 2.0.

DallE3

IG 2.0

1 Like

I’m having the exact same issues even on the very first generation, whenever I have used a reference image.

Hopefully this is fixed soon. Makes my workflow almost impossible at the moment as I use reference images frequently.

2 Likes

A second prompt of “Remove the noise from the image, while keeping all the lines.” or something along those lines can often repair most of this.

Welcome to the forum.

I think the developers know about it by now, because everybody can see it.

It must be fixed!

Here is an example of editing.

In the first image, the original was uploaded and recreated, removing the people in the process.
It disturbs my brain. The noise causes a kind of neurological stress, like MP3 artifacts.

In the second image, a prompt was generated from the original image, and this prompt was used for generation.

(Needs more testing to explore this further, but that can only be done once the bug is fixed.)

1 Like

My first output from a simple prompt.
An elderly dockworker in oil-stained canvas overalls sits on a massive rusted ship anchor in an abandoned shipyard at overcast dawn, cranes visible in mist behind him. He is brushing flakes of rust off the anchor with a stiff wire brush, orange dust falling onto his boots. Soft diffused overcast light from above, warm bounce from the rust surface onto his forearms, subtle rim light from the gray sky. Medium shot, Canon EOS R5, 85mm f/2, sharp focus on the rust texture where the brush meets the metal. Documentary photography, tactile industrial realism, layered rust colors from burnt orange to deep brown. No sparks, no fire, no smoke, no heavy grain, no color grading beyond natural rust tones. Aspect ratio 16:9.
The rusty anchor texture is noisy, and ‘digital ripples’ have washed over the bottom of the shot, blurring the character’s pants, boots, and the ground.

2 Likes
your prompt

An elderly dockworker in oil-stained canvas overalls sits on a massive rusted ship anchor in an abandoned shipyard at overcast dawn […]

Server overloaded?
Model overloaded?

I haven’t seen that strange texture in a couple of weeks, so it’ll probably resolve…

It does seem to nest in the context somehow… so it’s probably not either of the overloads mentioned…

it was annoying those two days.

1 Like