I’m working on a project that, at some point, needs to extract metadata from a user-supplied image. After reviewing the DALL-E API documentation, it seems like my goal might not be achievable. However, this is kind of odd, because ChatGPT allows for image uploads and provides context for the image.
Am I looking in the wrong place? I believe this feature should be available in the API, just as it is in ChatGPT itself. Can anyone clarify this for me?
The functionality to do step 1 via the API is not yet released. It will be released, but there are no official timescales for that yet. You can already do it in ChatGPT Plus with image input, so that is the feature that will get hooked up to the API.
Thank you for your suggestion. I will dive into the GitHub repo you sent me! Just one thing: I’m somewhat new to this world and want to try to implement this in my project. Where should I start learning this? Are there tutorials out there that would bring me up to speed?