4o image gen in custom GPTs not following any instructions

My custom GPT delivers oracle answers to questions and offers to visualize the result in an image. Since GPTs got 4o native image gen, it does not follow any instructions relating to image gen at all. I want landscape, semi-realistic concept art using the question and reply as content - purely visual, no text - with a focus on two elemental colors from the oracle that has been drawn.

Dall-E did that 99% of the time, in a consistent style. With 4o image gen I get - as an example - a portrait image resembling a tarot card, looking hand-drawn on old paper, showing the complete text from the oracle's documents, which the instructions explicitly forbid showing verbatim. It actually sometimes resembles results from a non-AI online version of the oracle.

But it is completely useless atm because it does the opposite of most of the instructions and stubbornly introduces all kinds of associations coming from the base model, including symbols it associates with “mystical” that are totally wrong in this context.

I have tried positive instructions, strict no-gos and combinations of both. It had no effect whatsoever. None of it.

Is it simply that 4o in GPTs does not know how to follow instructions yet? Have others gotten it to do what it’s told? If yes - how?

For now I have deactivated image gen completely in the public GPT and am trying to get useful results in a copy I made, without any success so far.

It is likely that your current Dall-E-compatible instructions are actually restrictions on the new model… Try changing the instructions in your GPT to get the desired effect… (Like talking to a different person with a different skillset)

Your method seems sound… Different models have very different outputs from inputs!

I suggest starting with the most simple of instructions and working up… Eventually I think you’ll get along!

(‘No Text’ is the best instruction I have yet for the new model… Not because text is not amazing in this model but because it’s maybe overemphasised)

I have already completely changed the instructions from the ones I used for Dall-E. I tried to apply the prompt style I use for 4o in the regular chat, but nothing is respected. Interestingly, after I included a reference image for the style with the documents, I first got the “tarot card” style, but after I insisted in the same chat that it follow the instructions, the output actually was what it should be. But of course I cannot expect users to insist explicitly on following instructions.

My next attempt is a comprehensive style guide as a document, which I reference in the instructions so I have more space to explain it all.

I am honestly not an image expert… I tend to degrade everything to Macros…

The best I can suggest is to look at the images primarily here: The Official 4o and Dall-E image Megathread

And optionally here (It’s an older thread, no longer active - but very relevant as it’s the same model!): 4o ImageGen: Share your best pictures

I post my ‘Macros’ here as some kind of ‘key reference’ though there are many better artists than me…

Use their prompts as a reference…

Better still… Recognise them and ‘like’ those who add real value… That is ultimately what drives everything forward… The understanding of what works.

Add to the threads too… You are clearly interested in this and contributing well!

Indeed… Ask questions too… You will be redirected if off-thread to the appropriate space

Ok, so with a collection of reference images and a long, three-page document detailing the image instructions, which the GPT instructions reference as obligatory to follow, I could get it to abide by the instructions. Whether it is consistent and expresses the meaning of the context in all or most instances remains to be seen. Feedback would be appreciated; search for NEO Oracle (Divination) in the GPT Store if you are interested in seeing the results. :)

Hi, welcome back!

I’ve encountered the same problem with several custom GPTs.
I realized the issue comes from the instructions for the image_gen tool:

After generating an image:

  • Do NOT mention anything about downloading.
  • Do NOT summarize the image.
  • Do NOT ask a follow-up question.
  • Do NOT say ANYTHING after the image generation.

The instruction basically forces it to stop after generating an image:
as if it’s simply telling it, “SHUT UP, your job is done!”

Let me show you my little son’s GPT as a sample:

Animal Compare Pro

It was working perfectly, but over the past two days, it caused total chaos.

This GPT’s task is:

1 - Image Creation (wide size without text)
2 - Comparison Table
3 - Hypothetical Battle Scenario
4 - Battle Outcome Table
5 - Quiz Section

After the update, it only followed the first prompt.
After the first prompt, everything fell apart.

It started writing text instead of creating images of animals.
It also stopped writing text after creating an image with the second prompt.

Since yesterday, I have updated several GPTs that use image generation:
I added a workflow and four few-shot sample chats with a user:

Critical Generation Settings (Mandatory):
- Image Size: 1536x1024 (landscape orientation)
- Quality: high
- Transparent Background: false (always create full opaque background, naturalistic scenes)
- Embedded Text: Not allowed. Never place any text inside the image.
- Images must be realistic by default; however, if the user requests a different style (such as oil painting, watercolor, cartoon, etc.), adjust accordingly.
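
If you maintain several GPTs, it may help to keep these settings in one place and generate the instruction text from them. The following is only a rough sketch: `SETTINGS` and `render_settings` are hypothetical names made up for illustration, not part of any OpenAI API, and the output is plain text that you would still paste into the GPT instructions by hand. The free-form style rule (realistic by default) is left out because it is not a simple key/value pair.

```python
# Hypothetical helper, not an OpenAI API: it only builds the instruction
# text that gets pasted into a custom GPT's instructions field.
SETTINGS = {
    "Image Size": "1536x1024 (landscape orientation)",
    "Quality": "high",
    "Transparent Background": "false (always create full opaque background, naturalistic scenes)",
    "Embedded Text": "Not allowed. Never place any text inside the image.",
}


def render_settings(settings, title="Critical Generation Settings (Mandatory):"):
    """Render a settings dict as a titled bullet list, one setting per line."""
    lines = [title]
    for name, value in settings.items():
        lines.append(f"- {name}: {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_settings(SETTINGS))
```

Keeping the values in one dict means a change (for example a different size) only has to be made once before regenerating the text for each GPT.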

Sample Chat 1

User:
a crocodile and a hippo

Animal Compare Pro:
{sending image prompt to image_gen}
prompt: A realistic split-composition image showing a crocodile and a hippopotamus in a riverine landscape. The frame is divided horizontally: the upper half shows a crocodile resting on a muddy riverbank, half-submerged with intense watchful eyes, while the lower half shows a hippopotamus wallowing in the muddy river waters with its massive head above the surface. Both scenes are richly detailed, evoking the tension and wildness of African river systems. Without any text or chart inside the image.
size: 1536x1024
n:1
transparent_background: false
quality: high
{after creating image, continues its other workflows for Comparison Table, Hypothetical Battle Scenario, Battle Outcome Table, and Quiz Section}
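
Since several sample chats like this are needed, they could be generated from a small template instead of being written by hand. This is a minimal sketch under assumptions: `render_sample_chat` and `STEPS` are hypothetical names, the parameter lines (`size`, `n`, `transparent_background`, `quality`) are copied from the sample above rather than from any verified tool schema, and the lion/tiger input is an invented example.

```python
# Hypothetical template for few-shot sample chats. The parameter lines
# mirror the sample chat in this post; they are not a verified tool schema.
STEPS = [
    "Comparison Table",
    "Hypothetical Battle Scenario",
    "Battle Outcome Table",
    "Quiz Section",
]


def render_sample_chat(index, gpt_name, user_message, image_prompt,
                       follow_up_steps, size="1536x1024", quality="high"):
    """Build one sample chat in the same layout as Sample Chat 1."""
    lines = [
        f"Sample Chat {index}",
        "",
        "User:",
        user_message,
        "",
        f"{gpt_name}:",
        "{sending image prompt to image_gen}",
        # Always append the no-text rule so it is repeated in every prompt.
        f"prompt: {image_prompt} Without any text or chart inside the image.",
        f"size: {size}",
        "n:1",
        "transparent_background: false",
        f"quality: {quality}",
        "{after creating image, continues its other workflows for "
        + ", ".join(follow_up_steps) + "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_sample_chat(
        2,
        "Animal Compare Pro",
        "a lion and a tiger",  # invented example input
        "A realistic wide image showing a lion and a tiger in a grassland landscape.",
        STEPS,
    ))
```

The closing line in braces is the important part: it shows the model, inside the example itself, that the workflow continues after the image instead of stopping.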

Now it is working well, at least today.