I'm using the sample from the documentation (https://platform.openai.com/docs/guides/vision):
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://my.site....",
                        "detail": "low",
                    },
                },
            ],
        }
    ],
    max_tokens=10,
)

print(response.choices[0].message.content)
I'm using this image (512x512).
However, in the response I see:
prompt_tokens: 303
completion_tokens: 10
total_tokens: 313
The same documentation page says a 512x512 image with “low detail” costs 85 tokens.
Why am I being charged 303 prompt tokens?
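For reference, here is how I read the image token rules on that page, written out as a rough sketch. The helper name `estimate_image_tokens` and the constant names are mine (hypothetical, not part of the OpenAI SDK), and I am assuming small images are not upscaled in "high" detail mode:

```python
import math

# Values as I understand them from the vision guide (assumptions on my side).
BASE_TOKENS = 85        # flat cost charged for every image
TOKENS_PER_TILE = 170   # extra cost per 512px tile in "high" detail mode


def estimate_image_tokens(width: int, height: int, detail: str = "low") -> int:
    """Hypothetical helper: estimate image tokens from the documented rules."""
    if detail == "low":
        # Low detail is a flat cost regardless of image size.
        return BASE_TOKENS

    # Step 1: shrink to fit inside a 2048x2048 square, keeping aspect ratio.
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale

    # Step 2: shrink so the shortest side is at most 768px.
    # Assumption: images that are already smaller are left as they are.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale

    # Step 3: count the 512px tiles needed to cover the image.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return BASE_TOKENS + TOKENS_PER_TILE * tiles


print(estimate_image_tokens(512, 512, "low"))
print(estimate_image_tokens(512, 512, "high"))
```

If I have the rules right, a 512x512 image comes out at 85 tokens for "low" and 255 for "high", plus whatever the text part of the prompt adds, so I still cannot see where 303 comes from.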