Difference between public and API versions of Dall-e

In the public version of Dall-e, I can attach an image and ask to make an image based on it in some other style. In the API I can use either
“Create image” (generations endpoint) where there is no image parameter. Or use “Create image variation” (variations endpoint), where there is no additional description parameter. How can I replicate the functionality of the public version via the API?

This is because the ChatGPT interface is essentially GPT-4 using dalle, not you using dalle directly.

The best way to try to replicate the ChatGPT experience would be to use gpt-4-vision-preview and do some prompt engineering/function calling to get it to call the Dalle API.

This is not simple to code though, so the easiest option might be to ask the vision API to describe it and then write the new prompt for you.

1 Like

Interesting. Perhaps you know where I can read about engineering features and how to handle Dall-e via ChatGPT?

Perhaps this can be achieved through assistants?
But I can’t find information on how to use them to connect the GPT and Dall-e modules through the API.

I’m not aware of any documentation for something like it. It would be something you’d have to design and code from the ground up.

What popped into my head is a page to upload images that are then fed to a prompt engineered gpt-4-vision-preview to then attempt to output the prompt directly into the Dalle API.

Like I said, not an easy thing to do unfortunately.

1 Like