I’ve been using some other image-to-text models out there, but I have been really amazed by ChatGPT’s image description feature.
I understood from yesterday’s keynote that the feature would finally be available in the API, but looking at the documentation this morning, I can’t find it…
Anyway, it’s best to just check Playground > Settings > Limits and see which models you have access to. The Playground UI might be a bit glitchy.
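If the Playground UI is acting up, you can also list the models your account can use directly from the API. A minimal sketch with requests (substitute your real API key):

import requests

headers = {"Authorization": "Bearer YOUR_API_KEY"}
resp = requests.get("https://api.openai.com/v1/models", headers=headers)

# Collect the model IDs and check whether the vision model is among them.
model_ids = [m["id"] for m in resp.json()["data"]]
print("gpt-4-vision-preview" in model_ids)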
I don’t know if you can even use Vision in the Playground right now, since I don’t see a URL field to point the model to an image. I also had to switch to “Complete” mode first to even get the model to show up in the list.
Now I can see it too, but as you already mentioned, only in the Playground’s “Complete” mode, with no way to link an image.
I also tried it via “Assistants” with the GPT-4 model (because the vision model was not listed there) and was able to attach the image, but I get an error message.
Just run it using your API credentials. Here is a simple example: put in your real API key, an actual image URL, and whatever you want in the System and User messages. Just like the chat models, System and User give you a lot of influence over the output.
import requests

# Substitute your real API key and a publicly reachable image URL.
API_KEY = "YOUR_API_KEY"

payload = {
    "model": "gpt-4-vision-preview",
    "messages": [
        {
            # The system message steers the model, just like the chat models.
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "You are a cool image analyst. Your goal is to describe what is in this image.",
                }
            ],
        },
        {
            # The user message mixes a text part and an image part.
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in the image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://link.to.something/image.png"},
                },
            ],
        },
    ],
    "max_tokens": 500,
}

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)
r = response.json()
print(r)  # full response, useful for debugging
print(r["choices"][0]["message"]["content"])  # just the model's answer