Image to text description in the API?

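Yes, you can call it with your regular API credentials. Here is a simple example: replace YOUR_API_KEY with your real API key, point the URL at an actual image, and put whatever instructions you want in the system and user messages. Just like with the chat models, the system and user messages have a lot of influence on the output.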

import requests

payload = {
    "model": "gpt-4-vision-preview",
    "messages": [
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "You are a cool image analyst.  Your goal is to describe what is in this image.",
                }
            ],
        },
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in the image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://link.to.something/image.png"},
                },
            ],
        },
    ],
    "max_tokens": 500,
}



headers = {
    "Authorization": "Bearer YOUR_API_KEY",  # replace YOUR_API_KEY with your real key
    "Content-Type": "application/json",
}


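response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)
r = response.json()
print(r)  # full JSON response, useful for debugging

# On success the description is in the first choice; on failure there is no "choices" key
if "choices" in r:
    print(r["choices"][0]["message"]["content"])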
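If your image is a local file instead of a public URL, the same `image_url` field also accepts a base64 data URL. Below is a rough sketch of how that could look, not an official recipe. The helper names (`bytes_to_data_url`, `image_to_data_url`, `build_payload`) are my own and not part of any library, and the PNG fallback for unknown file types is just an assumption.

```python
import base64
import mimetypes


def bytes_to_data_url(data, mime_type):
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_to_data_url(path):
    """Read a local image file and return it as a base64 data URL."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        mime_type = "image/png"  # assumption: fall back to PNG when the type is unknown
    with open(path, "rb") as f:
        return bytes_to_data_url(f.read(), mime_type)


def build_payload(image_url, system_text, user_text, max_tokens=500):
    """Build the same request body as in the example above (hypothetical helper)."""
    return {
        "model": "gpt-4-vision-preview",
        "messages": [
            {"role": "system", "content": [{"type": "text", "text": system_text}]},
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": user_text},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            },
        ],
        "max_tokens": max_tokens,
    }
```

To use it, build the payload with something like `build_payload(image_to_data_url("photo.png"), "You are a cool image analyst.", "What is in the image?")` (the file name is a placeholder) and post it with the same headers and `requests.post` call as before. Changing the system and user text is also an easy way to see how much they steer the description.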