ChatGPT does not receive any image file name or location information in the DALL-E generation metadata which it receives as a report of success. There is therefore no followup method for image access and analysis except by user upload.
(edit) It does have a mount point file that python can operate on.
That confirms what I have found.
I also asked to provide the instruction it had and it answered back
The current guidelines I follow restrict me from providing direct URLs to images generated through Dall-E. When an image is created, it is displayed within our conversation interface, and you can download it directly by clicking on the image. This approach ensures a seamless and integrated user experience within the chat, without the need for external links. If you have any other requests or need assistance with different tasks, feel free to let me know!
That’s so unfortunate and limiting of all the things custom GPT could potentially achieve.
Even programming the API would take careful workings around expected patterns. Given:
the image API can return either URL or base64 image;
image URL expires after an hour;
the chat API can only accept images for vision as user role message attachments, either URL or base64 image.
The best vision results would be preprocessing and controlling the image inputs to chat completions API, by resizing and vision parameter selection.
That basically leaves programming a chat that would have several calls required to make for a request “generate an image, then analyze the contents”. We must work within model training.
provide:
image generation function, returning database ID
image analysis function, invoked by database ID query, connected to custom pattern
database function, polling for non-conversation directory contents
database that contains ID, description, prompt, rewritten prompt, image, …
method:
user input
image generation function emitted
return value with database item number for image, base64 stored
image displayed
image analysis function emitted
tool call return confirming user will supply image
user message injection “here’s the image to analyze as discussed” with base64
with the code-interpreter, could you pass the image file via a POST request? i’d obviously prefer for my server to take in an image URL instead of an image file, but at this point i’d be willing to consider either.
(i have my own api that is doing further processing on the dall-e generated image.)