Does the DALL·E 3 API Take Images as Input?

I want to use my own picture as input to generate variations: for example, me walking on the street, or me eating in a hotel room. Can we achieve this with the DALL·E 3 API?

Hi!
Currently only GPT-4 accepts images as input, so you cannot iterate on an uploaded image with DALL·E 3 at this time.

What you could do is ask GPT-4 to create a description of your image and have DALL·E 3 create a new version based on that description.
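
A minimal sketch of that two-step workaround, assuming the v1 Python SDK. The function names, the file handling, and the prompt wording are my own illustration, not an official recipe; the model names are the ones current at the time of writing, and `dall-e-3` only supports `n=1`:

```python
import base64

def to_data_url(png_bytes: bytes) -> str:
    """Encode PNG bytes as a data URL for the vision chat endpoint."""
    return "data:image/png;base64," + base64.b64encode(png_bytes).decode()

def describe_then_generate(client, png_bytes: bytes, scene: str) -> str:
    """Hypothetical helper: ask GPT-4 with vision to describe the photo,
    then feed that description plus the desired scene to DALL-E 3."""
    described = client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe the person in this photo in detail."},
                {"type": "image_url",
                 "image_url": {"url": to_data_url(png_bytes)}},
            ],
        }],
    ).choices[0].message.content
    # Combine the description with the new scene the user wants.
    image = client.images.generate(
        model="dall-e-3",
        prompt=f"{described} The person is {scene}.",
        n=1,  # DALL-E 3 generates one image per request
    )
    return image.data[0].url
```

Keep in mind the result is a fresh generation from text, so the likeness will only be as good as the description.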


To add to what @vb mentioned, the “edit” and “variations” endpoints aren’t available for the DALL·E 3 API yet, but I imagine they’re coming soon.


Hey,

I am also waiting for the ability to input images via the DALL·E 3 API.
I am developing an app that uses DALL·E 3 as its image generator, and I want to let users upload an image and create different variations of it.

I am sure this won’t come. The development of DALL·E is going in a direction where you CANNOT create anything that doesn’t look completely A.I.-generated. 85 out of 100 images I receive are just simple, colorful paintings, not photos.
So if you give it a photo as input, the output would only be a “painting”-style variation that might imitate the style a little bit. Simply because they do not want you to:
a) clone persons (you upload a Donald Trump image and tell him to eat sh…)
b) clone any copyrighted painting and tell it to add a feature, like taking a recent painting and adding a “flower” somewhere, because that would be a copyright infringement (changing another person’s copyrighted work)

So, this won’t come.

You can use DALL·E to create comics… that’s it for now.

So, to be clear, is image-to-image generation not supported by OpenAI at all, whether via DALL·E 3 or another service?

The only OpenAI service that can base a picture on the imagery of another picture (without an intermediate step of pure text) is the variations endpoint.

It doesn’t accept a prompt to guide the alteration of an image. You just get a close re-imagining through the limited abilities of DALL-E 2.

You upload the image to use as the basis for the variation(s). It must be a valid PNG file, less than 4 MB, and square.
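
Those three constraints can be pre-checked before spending an API call. This is a stdlib-only sketch that reads the PNG signature and IHDR header directly; the function name and the idea of a pre-flight check are my own, only the limits themselves come from the documentation:

```python
import struct

MAX_BYTES = 4 * 1024 * 1024  # the endpoint's documented 4 MB limit
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def check_variation_input(data: bytes) -> bool:
    """Hypothetical pre-flight check: True if the bytes look acceptable
    for the variations endpoint (valid PNG header, under 4 MB, square)."""
    if len(data) >= MAX_BYTES or len(data) < 24:
        return False
    if data[:8] != PNG_SIGNATURE:
        return False
    if data[12:16] != b"IHDR":  # first chunk of a PNG is always IHDR
        return False
    width, height = struct.unpack(">II", data[16:24])
    return width == height
```

This only inspects the header; a fully corrupt image body would still be caught by the API itself.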

Usage:

from openai import OpenAI

client = OpenAI()

# The variations endpoint is DALL-E 2 only; "model" is optional and
# defaults to "dall-e-2"
response = client.images.create_variation(
    image=open("fatdancer.png", "rb"),
    model="dall-e-2",
    n=1,
    size="1024x1024",
)
print(response.data[0].url)
