How are reference images used internally? Do I need to mention them in the prompt?

Let’s take this code snippet from the docs as an example:

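from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
prompt = "Describe the desired edit here."  # placeholder; the docs define their own prompt

result = client.images.edit(
    model="gpt-image-1",
    image=[
        open("body-lotion.png", "rb"),
        open("bath-bomb.png", "rb"),
        open("incense-kit.png", "rb"),
        open("soap.png", "rb"),
    ],
    prompt=prompt,
)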

What really happens with these reference images internally?

I want to understand if I should talk about “reference images” in the textual prompt or if this mention is irrelevant.

Also, will the model know which image I mean if I refer to it by filename? For example: “use the same color as the soap in soap.png”.

The idea is to understand how this works behind the scenes so I can make the most of this feature. Any insight is welcome.

Thanks in advance!

I have a prompt that uses two images with the edit endpoint. In it, I explain what the different images are and how to use them together, and the model does a good job of figuring it all out. It works really well!

Same here. As you say, the trick is to explain the images in the prompt first. Otherwise, the results are bizarre.

Thanks for the replies @aaron_oblivion and @jeffvpace. I appreciate the feedback. So the idea is to refer to those images and explain what they are and how they should be used.
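For anyone landing here later, this is roughly how I read that advice, as a minimal sketch rather than anything official. The helper names (build_prompt, edit_with_references) and the role descriptions are my own, hypothetical ones. It refers to each image by its position in the list instead of its filename, because nothing in this thread confirms that the model ever sees the filenames.

```python
from contextlib import ExitStack

# Hypothetical descriptions, in the same order the files are passed to the API.
IMAGE_ROLES = {
    "body-lotion.png": "a bottle of body lotion",
    "bath-bomb.png": "a bath bomb",
    "incense-kit.png": "an incense kit",
    "soap.png": "a bar of soap",
}


def build_prompt(roles, instruction):
    """Describe each reference image by position, then state the task."""
    lines = [f"Image {i}: {role}" for i, role in enumerate(roles, start=1)]
    lines.append(instruction)
    return "\n".join(lines)


def edit_with_references(client, image_roles, instruction):
    """Send the images plus a prompt that explains what each image is.

    `client` is expected to be an openai.OpenAI() instance.
    """
    prompt = build_prompt(list(image_roles.values()), instruction)
    with ExitStack() as stack:
        # Open every file and make sure all handles are closed afterwards.
        files = [stack.enter_context(open(path, "rb")) for path in image_roles]
        return client.images.edit(
            model="gpt-image-1",
            image=files,
            prompt=prompt,
        )


# Preview the prompt text without calling the API.
print(build_prompt(
    list(IMAGE_ROLES.values()),
    "Arrange all four products in a gift basket. "
    "Use the color of the soap in image 4 for the background.",
))
```

To actually run the edit, pass an OpenAI() client to edit_with_references along with IMAGE_ROLES and the instruction text. Whether positional references like “image 4” work better than filenames is an assumption on my part, so it is worth testing both.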

Out of curiosity, it would still be great to hear from people on the engineering team, to get a very high-level understanding of how the reference files and filenames are used in the editing process.