How to send image and get image as output from GPT-4o model using API

ultrapythonengineeri · April 9, 2025, 4:36am

Using chatGPT UI we can attached a reference image, add a text prompt and get the output image. How we can achieve similar using API?

Yaqoob.Zazai · June 14, 2025, 4:33pm

Hey! What you’re referring to is only possible through the ChatGPT web interface and not via the API. Unfortunately, as of now, the ChatGPT API only supports generating images from text prompts or analyzing images to extract information but it doesn’t allow using an image as a reference for generation like in the web version. it may come soon I’m waiting for it as well.

sps · June 14, 2025, 8:36pm

With gpt-image-1 you can now do this using either the Responses API or the image edits endpoint with reference images like:

import base64
from openai import OpenAI
client = OpenAI()

prompt = """
Generate a photorealistic image of a gift basket on a white background 
labeled 'Relax & Unwind' with a ribbon and handwriting-like font, 
containing all the items in the reference pictures.
"""

result = client.images.edit(
    model="gpt-image-1",
    image=[
        open("body-lotion.png", "rb"),
        open("bath-bomb.png", "rb"),
        open("incense-kit.png", "rb"),
        open("soap.png", "rb"),
    ],
    prompt=prompt
)

image_base64 = result.data[0].b64_json
image_bytes = base64.b64decode(image_base64)

# Save the image to a file
with open("gift-basket.png", "wb") as f:
    f.write(image_bytes)

_j · June 14, 2025, 8:36pm

What he’s referring to is entirely possible using the API.

However one has to submit to the ID verification where you send your government ID picture along with taking video of yourself to a third-party sketch company withpersona.com to unlock the needed AI model gpt-image-1 on the images edits API.

And in fact, while dall-e-2 can infill and outfill exactly where you have drawn an alpha channel mask, gpt-image-1 (the gpt-4o based model being discussed), can only re-imagine the image with subtle changes, and can also completely ignore lack of a drawn mask - which is its job..

Image input, and auto-mask for my API app is only to outfill, but the prompt box has an instruction to add more stuffed animals:

Edited result received based on a “reference” image (which could be the best description of the tech):

(the prompt and subject is the kind needed not to get blocked by the moderation done, also)

lucid.dev · June 14, 2025, 8:51pm

But checkout the top-right corner of your first image.

Is that a cat levitating below the tree/above the other cat on bench?

Now that’s what I call quality.

_j · June 14, 2025, 9:22pm

The input was by dall-e-3, wide.

And, but check out the fact that the flying cats are now gone, along with the lake turning into a meadow, with no lake to look at, a park bench now observing the picnic, or as seen through only a 512px input version, along with everything else different, reframed, all that is outside of the masked alpha channel area in red.

Unchanged is the 32 bit RGBA mask used with dall-e-2, sent through the mask file form field.

OpenAI continues with mistruths:

Topic		Replies	Views
Can I prompt GPT to create images with prompt+image API image-generation	5	6490	June 22, 2025
Reference Input Image to Dalle3 API API	1	2139	June 14, 2025
Can GPT-4o generate image by (image,text) prompt? API api , gpt-4o	1	707	October 3, 2024
Picture-to-picture with GPT-4o and DALL·E API does not match ChatGPT API gpt-4	2	325	July 29, 2025
A public facing web app that allows users to upload a photo of their face to be used in a generated image API	1	181	April 17, 2025

How to send image and get image as output from GPT-4o model using API

Related topics