In the documentation (https://platform.openai.com/docs/guides/images#cost-calculation-examples) there is an example claiming that a 1024x1024px image passed to GPT-4.1 should consume 1024 input tokens. However, if I generate and pass an image of that size to the API, the response JSON reports 772 prompt tokens (see test code below). I believe that either the documentation or the calculation in the API is incorrect.
The 772 figure closely matches the documented GPT-4o figure of 765, and in fact if I run the same test with the model switched to GPT-4o I get an identical 772.
Meanwhile, if I run the same test with GPT-4.1-mini or GPT-4.1-nano, I get results very close to the documented 1024x1.62 and 1024x2.46 respectively. So only GPT-4.1 seems to deviate from the documentation; GPT-4o, GPT-4.1-mini and GPT-4.1-nano do not.
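For reference, here is a sketch of the two costing formulas as I read the docs: the tile-based rule used by GPT-4o (and, per the docs' example, supposedly not GPT-4.1) at "high" detail, and the 32px-patch rule used by the mini/nano models. The constants (85 base + 170 per 512px tile; 1536-patch cap) are from the documentation; the scaling steps are my interpretation of it, so treat this as illustrative, not authoritative:

```python
import math

def tile_tokens(width, height, base=85, per_tile=170):
    # "High" detail, tile-based rule (GPT-4o per the docs):
    # 1. scale the image to fit within 2048x2048
    scale = min(1.0, 2048 / max(width, height))
    width, height = width * scale, height * scale
    # 2. scale so the shortest side is at most 768px
    scale = min(1.0, 768 / min(width, height))
    width, height = width * scale, height * scale
    # 3. count 512px tiles, then base + per-tile cost
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return base + per_tile * tiles

def patch_tokens(width, height, cap=1536):
    # Patch-based rule (GPT-4.1-mini/nano per the docs):
    # count 32x32px patches, capped at 1536; the model-specific
    # multiplier (1.62, 2.46, ...) is applied on top of this.
    return min(cap, math.ceil(width / 32) * math.ceil(height / 32))

print(tile_tokens(1024, 1024))   # 765 -- matches the GPT-4o docs figure
print(patch_tokens(1024, 1024))  # 1024 -- the base for the mini/nano multipliers
```

A 1024x1024 image gives 765 under the tile rule, which is within a few tokens of the 772 I observe for GPT-4.1 (the small gap is presumably message-wrapping overhead), and 1024 patches under the patch rule, matching the documented mini/nano bases.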
Any suggestions appreciated.
Reproducible test as of today:
import base64

import PIL.Image
from openai import OpenAI

client = OpenAI()

def generate_solid_color_png(width, height, color, output_path):
    """
    Generates a solid-colour PNG.
    - width, height: dimensions in pixels
    - color: colour (e.g. "#RRGGBB" or (R, G, B))
    - output_path: where to save the PNG
    """
    image = PIL.Image.new("RGB", (width, height), color)
    image.save(output_path, format="PNG")
def send_image_for_completion(width, height, color, model="gpt-4.1", detail="high"):
    # 1. Create the PNG
    tmp = "temp.png"
    generate_solid_color_png(width, height, color, tmp)
    # 2. Read & Base64-encode
    with open(tmp, "rb") as f:
        b = f.read()
    data_url = "data:image/png;base64," + base64.b64encode(b).decode("utf-8")
    # 3. Wrap it in a content block
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": data_url,
                            "detail": detail
                        }
                    }
                ]
            }
        ]
    )
    # 4. Inspect usage
    print("Prompt tokens:", response.usage.prompt_tokens)
send_image_for_completion(1024, 1024, '#00FF00')
Outputs: Prompt tokens: 772