From my understanding, DALL-E 3 can't take images as input, but I was using the ChatGPT web interface (not the API) and two examples made me question this.
In both of these examples there are details in the generated images that come from the original image but were never described in my prompt, so I'm confused. For example, the dog's harness is reproduced, and the house keeps details like its circular windows. Any explanation would be helpful!
It is a chat AI.
It has a DALL-E 3 tool that it can send text to.
It has a GPT-4 vision component where it can view and analyze images.
Thus, it can describe what it saw on the night in question to the sketch artist.
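In other words, the image never reaches DALL-E 3; only a text description does. Here's a minimal sketch of that pipeline using the OpenAI Python SDK, as an approximation of what ChatGPT does internally with its tool calls (the image URL and prompts are placeholders, and ChatGPT's actual internal plumbing is not public):

```python
from openai import OpenAI

client = OpenAI()

# Step 1: the vision model looks at the uploaded image and produces text.
vision = client.chat.completions.create(
    model="gpt-4-vision-preview",  # assumed vision-capable model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in detail."},
            # hypothetical image URL, stands in for the user's upload
            {"type": "image_url", "image_url": {"url": "https://example.com/dog.jpg"}},
        ],
    }],
)
description = vision.choices[0].message.content

# Step 2: only that text description is handed to DALL-E 3.
image = client.images.generate(
    model="dall-e-3",
    prompt=description,
    size="1024x1024",
)
print(image.data[0].url)
```

A detailed description can easily carry things like "dog wearing a red harness" or "house with circular windows", which is why those details survive even though you never typed them.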
DALL-E 3, like DALL-E 2 on the API, does have the ability to complete areas of an image and outfill beyond its borders. However, the only place this is exposed is within ChatGPT Plus, where you can only feed in previous AI-generated images from the same chat. It is not a feature of the API, where you could otherwise build a competitive product at 8 cents a try.
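For reference, this is what image editing looks like with DALL-E 2 on the API, via the `images.edit` endpoint; the filenames and prompt are placeholders, and the same call does not accept `model="dall-e-3"`:

```python
from openai import OpenAI

client = OpenAI()

# DALL-E 2 inpainting/outpainting. The mask's transparent pixels mark
# the regions the model is allowed to repaint; both files must be
# square PNGs under 4 MB.
result = client.images.edit(
    model="dall-e-2",
    image=open("base.png", "rb"),      # placeholder source image
    mask=open("mask.png", "rb"),       # placeholder mask
    prompt="a sunlit garden behind the house",
    n=1,
    size="1024x1024",
)
print(result.data[0].url)
```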