Image extraction using OpenAI

Hi guys! I am interested in one specific approach. I need to upload a file and extract the data from it. The text data has to be parsed as JSON (done), and if the file contains images (avatars), I have to upload them to OpenAI and return only the links to those avatars. Is that possible?

I am trying to achieve this using the OpenAI Assistants API.

ChatGPT did that with Code Interpreter instead of “looking” at the file directly. You can enable it when you create your assistant

(https://platform.openai.com/docs/api-reference/assistants/createAssistant#assistants-createassistant-tools) or through the UI.
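
For reference, enabling the tool at creation time could look roughly like this with the official `openai` Python package. This is only a minimal sketch: the model name, the assistant name, and the instructions are placeholders I made up, and `build_assistant_params` / `create_assistant` are hypothetical helper names, not part of the library.

```python
def build_assistant_params(model: str = "gpt-4o") -> dict:
    """Keyword arguments for creating an assistant with Code Interpreter enabled."""
    return {
        "name": "File data extractor",  # placeholder name (assumption)
        "instructions": "Extract the data from the attached file and return it as JSON.",
        "model": model,  # placeholder; use a model your account has access to
        "tools": [{"type": "code_interpreter"}],  # this is what turns the tool on
    }


def create_assistant(params: dict):
    """Create the assistant through the API (needs the openai package and an API key)."""
    # Imported here so the params helper above works without the package installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    return client.beta.assistants.create(**params)
```

Calling `create_assistant(build_assistant_params())` should then give you an assistant that can open the uploaded file in its sandbox.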

And what if I want to return it as part of the JSON object, like this?

"CustomArea": {
        "ProfilePicture": {
            "FileName": "zzz.png", // extracted image name
            "URL": "https://cdn.openai.com/xxx/yyy/zzz.png" // extracted image URL
        }
    }

:thinking:

You may need to do it in multiple steps (i.e. extract the text, extract the image, and then combine them).
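
The "combine" step can happen in your own code rather than in the prompt. A rough sketch, assuming you already have the parsed text JSON as a dict and have obtained the image name and a URL separately; `attach_profile_picture` is a hypothetical helper, and the example values in the usage note are made up.

```python
from typing import Optional


def attach_profile_picture(parsed: dict, file_name: Optional[str], url: Optional[str]) -> dict:
    """Return a copy of the parsed data with the CustomArea.ProfilePicture block filled in."""
    result = dict(parsed)  # shallow copy so the caller's dict is not modified
    custom_area = dict(result.get("CustomArea") or {})  # keep other CustomArea keys, if any
    custom_area["ProfilePicture"] = {
        "FileName": file_name or "",  # empty string when no image was found
        "URL": url or "",
    }
    result["CustomArea"] = custom_area
    return result
```

For example, `attach_profile_picture({"Name": "Jane"}, "zzz.png", "https://example.com/zzz.png")` keeps the `Name` field and adds the `CustomArea` block in the structure from the post above.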

What are you getting so far? It might just be a prompt thing.

This is my prompt:
Extract the data from the attached file and return me only the JSON without any extra info using values from the file. You also can add new items to arrays. If there is no appropriate data use nullable structure like empty string or null. If you can parse the image upload it to the openAI server. Add the received URL that can I use to download the file and name into the 'CustomArea' block. Use this structure:.

But I am getting either an empty JSON part or only the filename:

 "CustomArea": {
        "ProfilePicture": {
            "FileName": "",
            "URL": ""
        }
    }

Maybe you know how to get the download link?

Or I get a response like this: You can download the image using the following link: [Download Profile Picture](sandbox:/mnt/data/profile_picture.jpeg).

Note: I don’t work with assistants because I think they’re dumb, so this is just off the top of my head:

I don’t think the model knows the direct link as such; you need to fetch it through the files endpoint

https://platform.openai.com/docs/assistants/tools/code-interpreter/reading-images-and-files-generated-by-code-interpreter
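
Something along these lines might work for pulling the generated file out. It is a sketch, not a tested solution: it assumes the assistant message is available as a plain dict (e.g. via `message.model_dump()` in the `openai` Python package), that generated files show up either as `image_file` content parts or as `file_path` annotations on the text, and the names `find_generated_files` and `download_file` are hypothetical helpers of my own.

```python
from pathlib import PurePosixPath
from typing import Dict, List


def find_generated_files(message: dict) -> List[Dict[str, str]]:
    """Collect file IDs (and names, where known) referenced by an assistant message."""
    found = []
    for part in message.get("content", []):
        if part.get("type") == "image_file":
            # Images produced by Code Interpreter carry only a file ID.
            found.append({"file_id": part["image_file"]["file_id"], "file_name": ""})
        elif part.get("type") == "text":
            for ann in part["text"].get("annotations", []):
                if ann.get("type") == "file_path":
                    # ann["text"] is the "sandbox:/mnt/data/..." string shown in the reply.
                    sandbox_path = ann["text"].replace("sandbox:", "", 1)
                    found.append({
                        "file_id": ann["file_path"]["file_id"],
                        "file_name": PurePosixPath(sandbox_path).name,
                    })
    return found


def download_file(file_id: str, target_path: str) -> None:
    """Fetch the file bytes through the files endpoint and save them locally."""
    # Imported here so the parsing helper above works without the package installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    data = client.files.content(file_id).read()
    with open(target_path, "wb") as fh:
        fh.write(data)
```

As far as I can tell, the files endpoint needs your API key, so this gives you the bytes rather than a public URL. If the JSON has to contain a link, you would probably have to host the downloaded image yourself and put that URL into `CustomArea`.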