How to use GenID in Dall E API?

Hi there, i am building a language learning and therapy tool for ESL teachers and Speech therapist. I am using API to call DallE for text to image. The purpose is to help kids with speech delay to learn language with images. When using web front end, i can quote ‘gen_id’ to tell DallE to generate similar images ‘in the style of’ previous one. However this cannot be achieved when i using API.
I even asked ChatGPT itself to give me the syntax and the attached image is the result. However, it seems this doesn’t work! Can anyone shed some light?

This is important because for speech delay kids, it’s crucial that the delta of images should be minimised. i.e. when learning ‘a happy Mum’ v.s. ‘a sad Mum’, the Mums should stay the same so that the concept of happy v.s. sad can be clearly convayed…

2 Likes

Yeah, it’s not possible on API… yet.

I think the new GPT-4o (omni) model might be able to do it as its multi-modal… ie you could send an image and say make more like this…

1 Like

Does this method also work in dalle3? Only seeing dalle2 can accept image through Creating image variation

Your question is of a different topic: can DALL-E 3 accept input images and be used on the API for variations or edits.

The answer is currently no.

Like the gen_id seed parameter in a ChatGPT conversation, edits to existing pictures with DALL-E is something OpenAI has only offered in ChatGPT and not on the API, and only for previous AI-generated images in a chat. The capability is demonstrated on OpenAI’s own product but not offered to developers.

Thank you for your reply.

My scenario involves inputting a photo of a person (labeled as 1) and a prompt to generate a comic portrait (labeled as 2), followed by using the same thread to input a background prompt to create comic 3 based on image 2, and repeating this step to produce images 4-6, ultimately forming a series of four comics with different backgrounds, how can we ensure continuity of the same character in images 3-6?

Previously, using ChatGPT, this could be achieved by utilizing the ‘gen_id’ from image 2.
Based on your response, it appears that dalle3 api does not support image-to-image generation.
Currently using the API, I plan to input 1 to generate a descriptive text (model=‘gpt-4o’), which I will then provide to dalle3 to draw a comic (model=‘dall-e-3’). This comic will be included in the message history, add a prompt, and generate image 3. I will follow the same steps for images 4-6.

Are there any steps in this process that can be omitted or optimized? Thank you very much!