I’m working on a project that, at some point, needs to extract metadata from a user-supplied image. After reviewing the DALL-E API documentation, it seems like my goal might not be achievable. However, this is kind of odd, because ChatGPT allows image uploads and provides context for the image.
Am I looking in the wrong place? I believe this feature should be available in the API, just as it is in ChatGPT itself. Can anyone clarify this for me?
Isn't that weird? The chat interface does let you upload an image and get specific data about the image back. Do you know of any alternatives for my use case?
The functionality to do step 1 via the API is not yet released. It will be released, but there are no official timescales for that yet. You can do it in ChatGPT Plus with image input, and it is that feature that will get hooked up to the API.
Thank you for your suggestion. I will dive into the GitHub repo you sent me! Just one thing: I’m somewhat new to this world and want to try to implement this in my project. Where should I start with learning this? Are there tutorials out there that can bring me up to speed?
My best advice on how to get started is to just do it: use git clone [URL] to clone the repo, get it up and running, and finally import the relevant bits into your project.
You can ask ChatGPT to explain any errors you run into along the way; it is usually pretty good at that.