Image to text description in the API?

curt.kennedy · December 5, 2023, 7:02pm

Just run it with your API credentials. Here is a simple example: put in your real API key, the actual image URL, and whatever you want in the System and User messages. As with the chat models, the System and User messages have a lot of influence on the output.

import requests

API_KEY = "YOUR_API_KEY"  # replace with your real API key

payload = {
    "model": "gpt-4-vision-preview",
    "messages": [
        {
            # The system message steers how the model describes the image.
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "You are a cool image analyst.  Your goal is to describe what is in this image.",
                }
            ],
        },
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in the image?"},
                {
                    "type": "image_url",
                    # Replace with the URL of your actual image.
                    "image_url": {"url": "https://link.to.something/image.png"},
                },
            ],
        },
    ],
    "max_tokens": 500,
}

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)
r = response.json()
print(r)  # full response, useful for spotting API errors
print(r["choices"][0]["message"]["content"])  # just the description
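If you want to try different System and User prompts without editing the dict by hand each time, here is a rough sketch that wraps the same payload in a function. The helper names build_vision_payload and extract_text are my own, not part of any library, and the error check assumes a failed request comes back as JSON with an "error" object containing a "message" field.

```python
import json


def build_vision_payload(system_text, user_text, image_url,
                         model="gpt-4-vision-preview", max_tokens=500):
    """Hypothetical helper: build the same payload shape as the example above."""
    return {
        "model": model,
        "messages": [
            {
                "role": "system",
                "content": [{"type": "text", "text": system_text}],
            },
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": user_text},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            },
        ],
        "max_tokens": max_tokens,
    }


def extract_text(response_json):
    """Hypothetical helper: return the description, or raise if the API reported an error.

    Assumes an error response looks like {"error": {"message": "..."}}.
    """
    if "error" in response_json:
        raise RuntimeError(response_json["error"].get("message", "unknown API error"))
    return response_json["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Placeholder prompts and URL; swap in your own.
    payload = build_vision_payload(
        "You are a cool image analyst.",
        "What is in the image?",
        "https://link.to.something/image.png",
    )
    print(json.dumps(payload, indent=2))
```

Pass the result as json=payload to the same requests.post call, then run extract_text(response.json()) instead of indexing into the response directly. That way a bad key or bad image URL surfaces as a readable error rather than a KeyError on "choices".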
