GPT-Image-1.5 rolling out in the API and ChatGPT

Not so sure, when you get down to “application”.

“Redesign this UI (of openai.fm), proposing four different designs” - when run at the introduction of the prior image model on ChatGPT, I got lots of brainstorming text before the image tool was called, and then actually received four different designs of varying style:

Prior


Now

GPT-5.2 with the image tool (which still cannot be shut off in “personalization”) went in a whole different direction to fulfill my identical request for multiple designs. It said nothing about what it was doing (and produced no context observable by the image model), and it deviated very little from the input image, down to the same orange button… all in one image.

ChatGPT delivers no variations and will stop you at a single image. No regenerate button enabled.

So retry with a whole new chat against gpt-4o. Same: no chat, same symptom.


Let’s see how well it can avoid distorted “monster” human faces when small, a particular artifact that plagued the prior model.

Old: Zoomed to 3 images produced on a 4x5 grid

New: Zoomed to 3 images produced on a 4x4 grid

Faces are done much better. 12% three-armed people in that image, though, and an interesting range of motion there. It seems the dark sepia look might be mitigated.

I’m comparing to this slop with a minimal prompt…

Just going off gut mostly… Appreciate your better comparisons!

Testing GPT-Image-1.5 with a playful backflip dolphin vibe, same prompt base, different spins.

This model has some serious glow-up potential😏


Maybe we can all agree on how it makes its gibberish look more polished…

Specifically when we’re intentionally pushing the 4th wall by providing garbage as a reference or simply asking too much of any current model?

I mean…

@_j I get this feeling, maybe just a hunch… Has someone been tinkling in your Cheerios?

I do see improvements and I saw them the night before the announced release so there was no placebo…

Some stuff is better….
We still can’t prompt an image that gets 3-D printed and breathes, yet.

To answer my own question in case helpful to anyone else (I do indeed now have access to the new model)

– Mask editing (inpainting) does not seem to be improved; seems the model still redraws the whole image, not just the masked area. In fact I am getting better performance simply adding more detail about exactly what I want changed (and where) in the prompt without using a mask, than if I try the same thing using a mask.

– However, fidelity is certainly much improved when using a source image, in terms of maintaining consistency of elements from a provided image.

Guess I should read the fine-print:

The image generation tool allows you to generate images using a text prompt, and optionally image inputs. It leverages GPT Image models (gpt-image-1 and gpt-image-1-mini, and we’re working on support for gpt-image-1.5),

Was trying to update with the internal tools and kept getting errors. Looking forward to seeing this supported!

Now I’m officially obsessed with gpt-image-1.5, the quality is ridiculously good :relieved_face:

Office Party #2 (gpt-image-1.5)

Hi @windysoliloquy, with my chemistry training, I find this image very impressive.

Could I copy/take from you?

Thanks! :handshake:

sure it’s fair game but none of it is accurate under the letters, and i don’t even trust the letters

Yes, that was the context. The major upgrade is that it doesn’t take any prompting skills any longer to a) place a person on the couch and b) make the person do something out-of-the-ordinary and get reasonably good results. A simple prompt is enough.

I expect you have to rewrite your prompt template to get the best results.

Regarding the deal with OpenAI and Disney: Will we be able to create images of Disney characters? If so, when?

I got a Beholder the other day that is trademarked, I think?

Haven’t tried the Mouse yet.

One thing it’s having a problem with is styles I had in DALLE3…

This is CLOSE with a reference image…

but if I try to do it with text prompt only it kinda bombs out…i’ve not played with it a lot, tho…

One thing I miss about DALLE is that you could be somewhat unique? With the newer image models everything comes out super similar… I think it’s to make prompting easier for laymen? I dunno…

Progress continues!

I’ve never been too good with the ol’ image gen… but this makes it easy!!! Two thumbs up from me! Well played, OpenAI team.

BUG

Image edits endpoint refuses the snapshot model name.

Model documentation clearly states the model availability:

Plus: are you now going to be allowing dall-e-3 edits, as the error message indicates? That would be a great feature to finally deliver a working pixel-accurate edit (especially as dall-e-2 is extremely damaged from its former glory). Sending “dall-e-3” to the edits endpoint results in an error and a different list of accepted models than sending an unknown model does, perhaps from a second level of API model checks after the first generalization.

Here is GPT-Image-1.5 edit of a DALL-E-3 image:

Original DALL-E-3

GPT-Image-1.5 red eyes version using the edits endpoint

Prompt: Re-create the image. Change the blue light emanating from the robot’s eyes to red light. Important: You must modify the aspect ratio to fully fit the 1536x1024 resolution.
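For anyone wanting to reproduce that kind of edit, here is a minimal sketch of the request, assuming the official `openai` Python package and its `images.edit` method. The helper names `build_edit_request` and `send_edit` and the file name `robot.png` are made up for illustration, and the alias `gpt-image-1.5` is used because the snapshot name was reported as refused above.

```python
from pathlib import Path


def build_edit_request(image_path, prompt, size="1536x1024", input_fidelity=None):
    """Assemble keyword arguments for an image edit call (hypothetical helper)."""
    params = {
        "model": "gpt-image-1.5",  # alias; the snapshot name was reported as rejected
        "prompt": prompt,
        "size": size,              # landscape target named in the prompt
    }
    # Only send input_fidelity when explicitly requested, so the API default applies.
    if input_fidelity is not None:
        params["input_fidelity"] = input_fidelity
    return Path(image_path), params


def send_edit(image_path, params):
    """Send the edit; needs the openai package and OPENAI_API_KEY set."""
    from openai import OpenAI  # imported lazily so the builder works offline

    client = OpenAI()
    with open(image_path, "rb") as fh:
        return client.images.edit(image=fh, **params)


path, params = build_edit_request(
    "robot.png",  # hypothetical file name for the original DALL-E-3 image
    "Re-create the image. Change the blue light emanating from the robot's eyes "
    "to red light. Important: You must modify the aspect ratio to fully fit the "
    "1536x1024 resolution.",
)
print(params["model"], params["size"])
```

Calling `send_edit(path, params)` would perform the actual request; it is left out of the top-level script so the sketch runs without a key.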

This has been flagged to the team. Thanks for raising it!

BUG

I am NOT sending the input_fidelity parameter to the edits API here!

That means it should default to “low” or TURNED OFF, NO ADDITIONAL BILL:

Yet I’m getting clobbered with billing for input fidelity: “high” on gpt-image-1.5:

image

Two images should be 194 “tile” tokens each, as seen here with a switch to image edit on gpt-image-1-mini:

image
or on gpt-image-1, the same:

image

But what I get is exactly described as high:

194+194 + 4160+4160 =

image

Additional: Sending 'input_fidelity': 'low' does not stop 'image_tokens': 4354 - that’s an additional $0.033 per input image being billed.


Also, quality:“auto” is best described as “never anything but the highest price”. Just like “detail”: “auto” on vision models.


ACTION NEEDED

Stop overbilling by 22x

Set default parameters to the defaults.

Request body:

{'model': 'gpt-image-1.5', 'image': ['<canvas image.png bytes>'], 'prompt': "Make the animals look like they've gone even more crazy!", 'size': '1024x1024', 'timeout': 240, 'output_format': 'png', 'user': 'image-editor-user', 'quality': 'medium', 'background': 'opaque'}

Usage:

{'input_tokens': 4388, 'input_tokens_details': {'image_tokens': 4354, 'text_tokens': 34}, 'output_tokens': 1427, 'total_tokens': 5815, 'output_tokens_details': {'image_tokens': 1056, 'text_tokens': 371}}
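The numbers in that usage block can be checked with a few lines of Python. This is only a sketch of the arithmetic: the 194 and 4160 figures come from the post, while the $8 per million image input tokens is an assumption back-calculated from the quoted "$0.033 per input image", not a confirmed price. The function name `audit_image_tokens` is made up for illustration.

```python
TILE_TOKENS_PER_IMAGE = 194        # expected input tokens per image, per the post
HIGH_FIDELITY_EXTRA = 4160         # extra tokens per image seen with high fidelity
ASSUMED_USD_PER_MILLION = 8.00     # assumption inferred from the "$0.033" figure


def audit_image_tokens(usage, num_images):
    """Compare billed image input tokens with the expected low-fidelity count."""
    billed = usage["input_tokens_details"]["image_tokens"]
    expected = TILE_TOKENS_PER_IMAGE * num_images
    excess = billed - expected
    return {
        "billed": billed,
        "expected": expected,
        "excess": excess,
        "ratio": billed / expected,
        # True when the excess matches the high-fidelity surcharge exactly
        "looks_like_high_fidelity": excess == HIGH_FIDELITY_EXTRA * num_images,
        "excess_usd": excess * ASSUMED_USD_PER_MILLION / 1_000_000,
    }


# Usage dict copied from the response above (one input image in the request).
usage = {
    "input_tokens": 4388,
    "input_tokens_details": {"image_tokens": 4354, "text_tokens": 34},
    "output_tokens": 1427,
    "total_tokens": 5815,
    "output_tokens_details": {"image_tokens": 1056, "text_tokens": 371},
}

report = audit_image_tokens(usage, num_images=1)
print(report)
```

With one input image, the billed 4354 image tokens split into 194 + 4160, which is roughly 22 times the expected 194 and matches the "22x" figure in the post.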


PS: enjoy an image edit remix, employing mask-drawing in the “mask” form-data parameter, where the edit exceeded the mask boundary as well as my cost expectation:

I am FLOORED by the long prompt adherence… 1,564 character complex prompt. Top image is GPT-Image-1, bottom is GPT-Image-1.5. It’s…nuts.

@jeffvpace :handshake:

The image from 1.5 has high quality, but what if that is our future… I have worked a lot with DALL·E, and the OpenAI team has developed it very strongly in a short time.

I hope that is not our future, as in James Cameron’s Terminator…

There was Skynet, but first in 1984.

Respect for your work, but humans should come first :folded_hands::handshake:

human-in-the-loop Community