The documentation is a fib to cause overspending.
Omitting the detail parameter, or setting it to “auto”, on a 256x256 web image is billed at 255 tokens, the tiled price, not 85 tokens.
The AI does see the image twice at high, and the second “viewing” costs you twice as much for whatever reason.
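To make the arithmetic behind 255 vs 85 concrete, here is a rough sketch of the billing rule as I understand it: 85 base tokens, plus 170 per 512px tile at high detail. The function name image_tokens is made up for illustration, and the "no upscaling of small images" step is an assumption, chosen because it is what makes a 256x256 image come out as one tile (255) rather than four.

```python
import math

BASE_TOKENS = 85    # flat cost of the low-res "first look"
TILE_TOKENS = 170   # cost of each 512px tile at high detail


def image_tokens(width, height, detail="high"):
    """Estimate billed image tokens (hypothetical helper, not an official API)."""
    if detail == "low":
        return BASE_TOKENS
    # Anything else ("high", or "auto" per the billing observed above)
    # is treated as the tiled price.
    # Step 1: shrink to fit inside 2048x2048 (assumption: never upscale).
    scale = min(1.0, 2048 / max(width, height))
    width, height = width * scale, height * scale
    # Step 2: shrink so the shortest side is at most 768 (assumption: never upscale).
    scale = min(1.0, 768 / min(width, height))
    width, height = width * scale, height * scale
    # Step 3: count the 512px tiles needed to cover the result.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return BASE_TOKENS + TILE_TOKENS * tiles


print(image_tokens(256, 256, "low"))    # 85
print(image_tokens(256, 256, "high"))   # 255
print(image_tokens(1024, 1024, "high")) # 765
```

So at 256x256 the single tile (170) is exactly twice the base look (85), which is where the "second viewing costs twice as much" comes from.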
Low vs High:
The image depicts a stylized, cartoon-like drawing of a peeled banana lying on a wooden board. The banana is peeled halfway, with the skin peeled back and hanging loosely. The banana itself is yellow with some brown spots, indicating ripeness. The wooden board appears flat and simple, with a light brown color and visible wood grain texture. The background has a textured appearance, resembling a canvas, which gives the image an artistic feel. The overall style of the image is playful and whimsical.
time: 5.50s
The image depicts a stylized, pixelated representation of a peeled banana resting on a wooden board. The banana is shown with its peel partially removed, exposing the fruit inside. The texture and colors used in the image give it a somewhat rough, embroidered appearance, suggesting it might be a digital artwork designed to mimic the look of a tapestry or a pixel art style. The background is textured and has a gradient of pinkish hues, enhancing the artistic feel of the image.
time: 12.20s
Neither seems particularly distinguished. Neither identifies the “board” as a couch. “high” draws a false conclusion about the background texture.
Trial code with a 256px image URL:
from openai import OpenAI

client = OpenAI()

# Send the same image once per detail setting and compare output, usage, and latency
for detail in ["low", "high"]:
    # with_raw_response exposes the underlying HTTP response (for timing)
    response = client.chat.completions.with_raw_response.create(
        model="gpt-4-turbo",
        max_tokens=500,
        top_p=0.01,  # near-deterministic sampling so the runs are comparable
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text":
                 "Describe attached image content exhaustively."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://i.imgur.com/kxxQZIh.png",
                        "detail": detail,
                    },
                },
            ],
        }],
    )
    # Print the description, the billed token usage, and the request time
    print(response.http_response.json()["choices"][0]["message"]["content"])
    print(response.http_response.json()["usage"])
    print(f"time: {response.elapsed.total_seconds():.2f}s")
Fun test for you: For the cost of one detail:high, send the image three times at low, even alternating the rescaling you do yourself. See how the AI performs with multiple takes on the same image in context.
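If you want a starting point for that test, here is a minimal sketch of the message you would send: one text part plus three image parts, all at low detail. build_low_detail_message is a made-up helper name, and this version simply repeats the same URL; to alternate the rescaling you would swap in your own resized copies (for example as data URLs), which I have left out.

```python
def build_low_detail_message(prompt, image_urls):
    """Build one user message carrying several low-detail images (hypothetical helper)."""
    content = [{"type": "text", "text": prompt}]
    for url in image_urls:
        content.append({
            "type": "image_url",
            "image_url": {"url": url, "detail": "low"},
        })
    return {"role": "user", "content": content}


# Same image three times; replace entries with your own rescaled versions.
message = build_low_detail_message(
    "Describe attached image content exhaustively.",
    ["https://i.imgur.com/kxxQZIh.png"] * 3,
)

# Three low-detail images at 85 tokens each match one 256px high-detail image.
image_parts = [p for p in message["content"] if p["type"] == "image_url"]
print(len(image_parts) * 85)  # 255
```

Pass it as messages=[message] in the same create() call as the trial code, and compare the usage it reports against the single high run.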