I’ve been using some other image-to-text models out there, but I have been really amazed by ChatGPT’s image description feature.
I understood from yesterday’s keynote that the feature would finally be available in the API, but looking at the documentation this morning, I can’t find it…
Anyway, it’s best to just check Playground > Settings > Limits and see which models you have access to. The Playground UI might be a bit glitchy.
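If the Playground UI is acting up, you can also list the models your account can use directly from the API. A minimal sketch with requests (substitute your real API key):

import requests

headers = {"Authorization": "Bearer YOUR_API_KEY"}
resp = requests.get("https://api.openai.com/v1/models", headers=headers)

# Collect the model IDs and check whether the vision model is among them.
model_ids = [m["id"] for m in resp.json()["data"]]
print("gpt-4-vision-preview" in model_ids)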
I don’t know if you can even use Vision in the Playground right now, since I don’t see a URL field to point the model to an image. I also had to switch to “Complete” mode first to even get the model to show up in the list.
Now I can see it too, but as you already mentioned, only in the Playground’s “Complete” mode, with no way to link an image.
I also tried it via “Assistants” with the GPT-4 model (because the vision model was not listed there) and was able to attach the image, but I get an error message.
Just run it using your API credentials. Here is a simple example: put in your real API key, an actual image URL, and whatever you want in the System and User messages. Just like the chat models, System and User give you a lot of influence over the output.
import requests

# Substitute your real API key and a publicly reachable image URL.
API_KEY = "YOUR_API_KEY"

payload = {
    "model": "gpt-4-vision-preview",
    "messages": [
        {
            # The system message steers the model, just like the chat models.
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "You are a cool image analyst. Your goal is to describe what is in this image.",
                }
            ],
        },
        {
            # The user message mixes a text part and an image part.
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in the image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://link.to.something/image.png"},
                },
            ],
        },
    ],
    "max_tokens": 500,
}

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)
r = response.json()
print(r)  # full response, useful for debugging
print(r["choices"][0]["message"]["content"])  # just the model's answer