Image to text description in the API?

Hi

I’ve been using some other image to text models out there.

I have been really amazed by the image description feature of ChatGPT.

I understood from yesterday’s keynote that the feature would finally be available in the API. Looking at the documentation this morning, I can’t find it…

Did I miss something?

Welcome to the forum.

The “Learn how to use GPT-4 to understand images” page should help…

:sweat_smile:
Thanks. It does cover my needs.

Can’t understand how I managed to miss it despite scrolling through it twice.

Hi PaulBellow, I checked your link and can see the model “gpt-4-vision-preview”.

After checking in my Playground, I am not able to see the vision model. Is there any reason? Please see the screenshot below.

Thanks for your support. Edin

To see all the models you can use, not just those in Playground, go to:

Playground > Settings > Limits

Then click Show All Models; you should get something like this:

I don’t see a Vision model variant in Playground, but it is available in the API, and in this list.

Update: Weird, I see it in Playground after selecting Completion and then scrolling to Chat.

Anyway, it’s best to just use Playground > Settings > Limits and see what you have access to. The UI in Playground might be a bit glitchy.

I don’t know if you can even use Vision in the Playground now, since I don’t see a URL field to point the model to an image. But I had to go to “Complete” first to even get it to list.
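
If you would rather check from code than from the Playground UI, a rough sketch along these lines might help. It assumes the `GET /v1/models` endpoint returns a JSON object with a `data` list whose entries each carry an `id`. The helper names `fetch_models` and `vision_model_ids` are made up for illustration, and it uses the standard library's `urllib` rather than any official client.

```python
import json
import urllib.request

MODELS_URL = "https://api.openai.com/v1/models"


def fetch_models(api_key: str) -> dict:
    """GET the model list visible to this API key and return the parsed JSON."""
    request = urllib.request.Request(
        MODELS_URL, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def vision_model_ids(models_response: dict) -> list:
    """Return the sorted model IDs whose name contains 'vision'.

    Assumes the response shape {"data": [{"id": "..."}, ...]}.
    """
    ids = [entry["id"] for entry in models_response.get("data", [])]
    return sorted(model_id for model_id in ids if "vision" in model_id)
```

Calling `vision_model_ids(fetch_models("YOUR_API_KEY"))` with a real key should then list `gpt-4-vision-preview` if your account has access to it.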

Thank you curt.kennedy!

Now I can see it but, as you already mentioned, only in the Playground “Complete” mode, with no way to link an image.

I also tried it via “Assistant” with the GPT-4 model (because the vision model was not listed) and was able to attach the image, but I got an error message.

Just run it using your API credentials. Here is a simple example. Just put in your real API key, an actual image URL, and what you want in System and User. Just like with the chat models, the System and User messages have a lot of influence on the result.

import requests

# Request body: a system message that sets the role, plus a user message
# that pairs a text question with the image to describe.
payload = {
    "model": "gpt-4-vision-preview",
    "messages": [
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "You are a cool image analyst.  Your goal is to describe what is in this image.",
                }
            ],
        },
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in the image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://link.to.something/image.png"},
                },
            ],
        },
    ],
    "max_tokens": 500,
}

# Replace YOUR_API_KEY with your real API key.
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
}

response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)
r = response.json()
print(r)  # full JSON response
print(r["choices"][0]["message"]["content"])  # just the model's description
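
If you want to reuse this for several images, one possible way to wrap the same request is sketched below. The function names (`build_vision_payload`, `extract_reply`, `describe_image`) are hypothetical, the standard library's `urllib` is swapped in for `requests` so there is no third-party dependency, and the handling of failures assumes the API reports them as a JSON body with an `error` object containing a `message`.

```python
import json
import urllib.error
import urllib.request

CHAT_URL = "https://api.openai.com/v1/chat/completions"


def build_vision_payload(system_text, user_text, image_url,
                         model="gpt-4-vision-preview", max_tokens=500):
    """Build the same request body as the example above from its parts."""
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": [{"type": "text", "text": system_text}]},
            {"role": "user",
             "content": [
                 {"type": "text", "text": user_text},
                 {"type": "image_url", "image_url": {"url": image_url}},
             ]},
        ],
        "max_tokens": max_tokens,
    }


def extract_reply(response_json):
    """Return the model's text, or raise if the body reports an error."""
    if "error" in response_json:
        # Assumed error shape: {"error": {"message": "..."}}
        message = response_json["error"].get("message", "unknown API error")
        raise RuntimeError(message)
    return response_json["choices"][0]["message"]["content"]


def describe_image(api_key, payload):
    """POST the payload and return the description text."""
    request = urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=60) as response:
            body = json.load(response)
    except urllib.error.HTTPError as err:
        # Assumes the error response body is JSON as well.
        body = json.load(err)
    return extract_reply(body)
```

Something like `describe_image("YOUR_API_KEY", build_vision_payload("You are a cool image analyst.", "What is in the image?", "https://link.to.something/image.png"))` would then return just the description text, and raise with the API's own message if the call is rejected.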

It seems you want ChatGPT Plus:

Go to https://chat.openai.com/?e=irni - $20 USD/month.

The poor quality of prompt following now seems typical. This is “ethnically Filipino Asian” for you: