DALLE-3 Help Needed (attempted gen_id/image_id)

Hi

I’ve managed to refine a prompt that generates images of a glass, each filled with a different type of juice.

Here’s the prompt:

“Create a photorealistic image of a single, clear glass with a matte finish, placed centrally against a white background. The glass should be evenly filled with {}, without any overflow. The glass should take up exactly one-fourth of the image’s width and its height should be proportional, maintaining the same dimensions across all images. The photograph must be taken from directly above, with a camera angle fixed at 45 degrees, ensuring the glass’s rim forms a perfect circle in the center of the frame. Soft, uniform lighting should accurately capture the texture and color of the {}, with no shadows around the glass.”

Each {} is replaced with the type of juice, taken from a list of products I’ve curated.
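A minimal sketch of how the placeholder could be filled for each product, assuming Python and a made-up product list (`JUICES`, `PROMPT_TEMPLATE` and `build_prompts` are hypothetical names, and the template is a shortened version of the prompt above). It uses a named placeholder so that a single value fills both occurrences:

```python
# Hypothetical product list; the real list is not shown in the thread.
JUICES = ["orange juice", "apple juice", "cranberry juice"]

# Shortened version of the prompt above. A named placeholder replaces the
# bare {} so that one value fills both occurrences.
PROMPT_TEMPLATE = (
    "Create a photorealistic image of a single, clear glass with a matte "
    "finish, placed centrally against a white background. The glass should "
    "be evenly filled with {juice}, without any overflow. Soft, uniform "
    "lighting should accurately capture the texture and color of the {juice}."
)


def build_prompts(juices):
    """Return one prompt per juice; only the juice name differs."""
    return {juice: PROMPT_TEMPLATE.format(juice=juice) for juice in juices}


prompts = build_prompts(JUICES)
for juice, prompt in prompts.items():
    print(juice, "->", prompt[:60] + "...")
```

Keeping everything except the juice name fixed at least rules out the prompt text itself as a source of variation between images.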

When I run this through the API, I unfortunately do not get consistent images. The glass sometimes has a random shape, the photo is taken too close or from too far away, and the texture can vary a lot.

I am generating these for my ecommerce store. A competitor does exactly the same thing, but all of their images are symmetrical. I believe they also use DALLE-3, because their images look extremely close to what DALLE-3 produces, so I’m not sure whether my prompt is the problem.

Of course, I tried some workarounds, such as supplying a reference picture before the image is generated, but DALLE-3 does not support that.

Any suggestions?

You never will get fully consistent images.

Results from the web interface (ChatGPT) will differ too.

Using a ref_id will produce more similar images, but there are always some differences.


[Example images: red wine, orange juice, milk, coffee]


I did reference the first image, but you also have to describe, at the very least, the look of the glass (in terms of shape).

I appreciate your reply, but I would like to show you something. Do you have Discord by any chance? Then I can give you more information on why this may be possible.

Tip: I learned to “lower” my expectations.

Dall-E is not Photoshop (DINP!).

So it will never produce 100% accurate results the way a person creating images by hand or with a camera would.

It is amazing technology; after barely one month of a paid subscription, I can’t work without it anymore.

But you have to “learn” and “understand” what it can do and what it can’t.

Now 9 out of 10 images are (for me) really usable, because I lowered my threshold (not in terms of quality, but in terms of “what do you expect from a tool like this”).

Nope, sorry :)

I want to link a website, but links are not allowed here. All images on that website are generated by DALLE-3.

I understand. You can send a DM through the system here.

But in general, “exactly the same image, but with some differences” is very difficult to create.

Even the exact same image is not 100% reproducible, even when referring to an existing #ID.

There is always some numeric noise.

And as far as I know (I could be wrong), the API does not support the gen_id and referencing_images_id?

From what I have seen, the “competitor” is not using Dall-E, and maybe not even AI at all.

But those are my personal findings, based on the (example) image meta-data and visual look and feel.

The main obstacle to creating “consistent / symmetrical” images is that the Dall-E API does not support the gen_id.

It does show the parameter, but you can’t set it. So every new image is different from the image before.
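For reference, a minimal sketch of what a request through the API might look like, assuming the official `openai` Python package (v1.x) and an `OPENAI_API_KEY` environment variable. `build_request` and `generate` are hypothetical helper names. The keyword arguments are the ones I am aware of for DALL-E 3 image generation, and none of them is a gen_id or a reference image:

```python
def build_request(prompt):
    """Keyword arguments for one DALL-E 3 image request.

    Note that there is no gen_id or reference-image field among them.
    """
    return {
        "model": "dall-e-3",
        "prompt": prompt,
        "size": "1024x1024",
        "quality": "standard",
        "style": "natural",  # assumption: may look less stylized than "vivid"
        "n": 1,              # DALL-E 3 accepts only one image per request
    }


def generate(prompt):
    """Send the request; needs the openai package and OPENAI_API_KEY set."""
    from openai import OpenAI  # imported lazily so build_request works without it

    client = OpenAI()
    response = client.images.generate(**build_request(prompt))
    image = response.data[0]
    # DALL-E 3 may rewrite the prompt; the rewritten text can help explain
    # why two images from the "same" prompt look different.
    return image.url, image.revised_prompt
```

Logging `revised_prompt` for each product is a cheap way to see how much the prompt itself changed between requests.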

When you try the same thing via the chat interface of GPT, you can set the reference_image_id to the gen_id, which gives much more similar-looking results.


Example

For example, a basic plate with “stuff” on it.

All three images use the exact same prompt (apart from the “stuff”) and the same reference_image_id.

As you can see, the plate / shadow / material / etc… is “almost” the same. But using the API, this is not possible (because of the lack of the referencing_image_id).


Solution

  1. Render an empty plate
  2. Render plates with stuff
  3. Mask the “stuff” in Photoshop
  4. Merge the “stuff” with the empty plate

This way you have the exact same plate / shadow, but with different “stuff”.

Quick example, done in 15 seconds.
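The masking and merging steps above were done by hand in Photoshop. As a rough sketch of the same idea in code, assuming the Pillow library and a mask you have already prepared (`merge_onto_base` is a hypothetical helper name, and the tiny synthetic images only stand in for real renders):

```python
from PIL import Image


def merge_onto_base(base, rendered, mask):
    """Paste the masked region of `rendered` onto a copy of `base`.

    base     -- the empty plate render
    rendered -- a render of the plate with "stuff" on it, same size as base
    mask     -- grayscale image: white where the "stuff" is, black elsewhere
    """
    if not (base.size == rendered.size == mask.size):
        raise ValueError("all three images must have the same size")
    # Image.composite takes pixels from the first image where the mask is
    # white and from the second image where it is black.
    return Image.composite(
        rendered.convert("RGB"), base.convert("RGB"), mask.convert("L")
    )


# Tiny synthetic demo so the sketch runs without any image files.
base = Image.new("RGB", (4, 4), (255, 255, 255))    # white "empty plate"
rendered = Image.new("RGB", (4, 4), (200, 50, 50))  # red "stuff" render
mask = Image.new("L", (4, 4), 0)
mask.paste(255, (1, 1, 3, 3))                       # mark the 2x2 centre as "stuff"
result = merge_onto_base(base, rendered, mask)
```

Creating a good mask is still the manual part; a soft-edged (blurred) mask usually hides the seam better than a hard one.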

Yeah, to go along with @Foo-Bar’s excellent advice, you want to craft the prompt so that most of it stays the same except for the “focus”, i.e. what’s on the plate in that example.

I have heard the DALLE3 team is hard at work on improving and that more control is on their wishlist too.

Great to hear.

And it’s not a “bad” thing to do manual corrections, I think.

After all, Dall-E is a tool, not a solution.

Sometimes it creates an image in 2 seconds and then I spend about 2 hours adjusting it (changing the composition, removing parts, adding parts, making things sharper / more blurry, etc…).

But if I had to create the image from scratch, it would take me 2 days…

So for me, this is an excellent tool, boosting my productivity and creativity to space and beyond.

Do you know if gen_id has been supported in 2024?