I’m sending a properly formatted image_url request to gpt-5.4-mini via the Chat Completions API. The image is a 1920x1080 PNG (~131KB), sent as a base64 data URL with detail: high.
Per the Images and vision docs, gpt-5.4-mini uses patch-based image tokenization with a 1,536-patch budget and a 1.62x multiplier. A 1920x1080 image should therefore cost at most about 2,500 prompt tokens (1,536 × 1.62 ≈ 2,488).
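To make the expected figure concrete, here is a rough sketch of the arithmetic. The 1,536-patch budget and 1.62x multiplier are the numbers quoted above; the 32px patch size and the helper names (`raw_patches`, `image_token_upper_bound`) are assumptions for illustration, not something confirmed for this model.

```shell
# Patches needed before any downscaling: ceil(w / patch) * ceil(h / patch).
# The 32px patch size passed in below is an assumption.
raw_patches() {
  local w=$1 h=$2 patch=$3
  echo $(( ((w + patch - 1) / patch) * ((h + patch - 1) / patch) ))
}

# Upper bound on billed image tokens once the patch budget is hit:
# budget * multiplier, rounded to the nearest integer.
image_token_upper_bound() {
  awk -v p="$1" -v m="$2" 'BEGIN { printf "%d\n", p * m + 0.5 }'
}

raw_patches 1920 1080 32           # prints 2040, which exceeds the 1536 budget
image_token_upper_bound 1536 1.62  # prints 2488
```

Under those assumptions the image is over budget before scaling, so it should be capped at 1,536 patches, and the bill should land near 2,488 tokens rather than in the tens of thousands.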
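Instead, I’m seeing ~66,000 prompt tokens. That is roughly what you’d get if the base64 string were tokenized as text: a ~131KB PNG becomes ~175KB of base64, and ~66K tokens for that works out to about 2.7 chars/token, which seems plausible for base64 since it tokenizes less efficiently than the ~4 chars/token typical of English text.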
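The size estimate can be sanity-checked with a few lines of shell. This is only a sketch: it assumes "~131KB" means 131,000 bytes, and the helper names are made up for illustration.

```shell
# Length of the base64 encoding of n bytes: 4 output chars per 3 input bytes,
# with the last group padded.
base64_len() {
  local n=$1
  echo $(( (n + 2) / 3 * 4 ))
}

# Characters per token implied by a character count and a token count.
chars_per_token() {
  awk -v c="$1" -v t="$2" 'BEGIN { printf "%.2f\n", c / t }'
}

base64_len 131000             # prints 174668
chars_per_token 174668 66000  # prints 2.65
```

The implied ratio depends on the exact file size, so treat it as an order-of-magnitude check rather than proof of how the request was tokenized.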
Also, the prompt_tokens_details in the API response contains no image_tokens field:
"prompt_tokens_details": {
  "audio_tokens": 0,
  "cached_tokens": 2304
}
Shouldn’t this show image_tokens if I’m sending an image?
Request payload (base64 truncated):
{
  "model": "gpt-5.4-mini",
  "max_completion_tokens": 256,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What do you see? Reply in one sentence."
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;base64,iVBORw...",
            "detail": "high"
          }
        }
      ]
    }
  ]
}
The model does understand the image. It returns a correct description of the screenshot contents. So the vision capability works, but tokenization/billing appears to fall back to treating the base64 as plain text.
Is this a known issue with gpt-5.4-mini on the Chat Completions API?
Here’s a minimal script that reproduces this, hitting the API directly (no SDK):
#!/usr/bin/env bash
# Reproduces the token count by calling the Chat Completions API directly.
set -euo pipefail

# Build the request body in a temp file, which keeps the long base64
# string off the curl command line.
TMPFILE=$(mktemp)
trap 'rm -f "$TMPFILE"' EXIT

# -w 0 disables line wrapping (GNU coreutils base64).
BASE64_IMAGE=$(base64 -w 0 "<path_to_1080p_image>")

cat > "$TMPFILE" <<EOF
{
  "model": "gpt-5.4-mini",
  "max_completion_tokens": 256,
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What do you see? Reply in one sentence."},
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;base64,${BASE64_IMAGE}",
            "detail": "high"
          }
        }
      ]
    }
  ]
}
EOF

# Print the response body followed by the HTTP status code.
curl -s -w "\nHTTP_STATUS: %{http_code}\n" \
  https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${OPENAI_KEY}" \
  -d @"$TMPFILE"
And the response:
thiagolobo@nephtis-desktop:~/$ ./openai.sh
{
  "id": "chatcmpl-DdPgVQgzShKGIbQoqun4PyrV6zywL",
  "object": "chat.completion",
  "created": 1778285895,
  "model": "gpt-5.4-mini-2026-03-17",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "A browser is open to the Van Zandt CAD online property tax search page with search fields and helpful hints visible.",
        "refusal": null,
        "annotations": []
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 65599,
    "completion_tokens": 26,
    "total_tokens": 65625,
    "prompt_tokens_details": {
      "cached_tokens": 1792,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default",
  "system_fingerprint": null
}
HTTP_STATUS: 200
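For anyone re-running this, a small sketch for checking a saved copy of the output above. The file name `response.txt` and both helper names are hypothetical. Plain `grep`/`sed` are used because the captured output ends with an `HTTP_STATUS` line and so is not valid JSON on its own. Whether this endpoint ever emits an `image_tokens` field is not something I can confirm.

```shell
# Prints the prompt_tokens value from a saved response file.
# The closing quote in the pattern keeps it from matching prompt_tokens_details.
prompt_tokens() {
  sed -n 's/.*"prompt_tokens": *\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Reports whether the saved response mentions an image_tokens field at all.
has_image_tokens() {
  if grep -q '"image_tokens"' "$1"; then
    echo "present"
  else
    echo "absent"
  fi
}

# Usage, assuming the script output was saved first:
#   ./openai.sh > response.txt
#   prompt_tokens response.txt
#   has_image_tokens response.txt
```

Comparing the extracted `prompt_tokens` against the expected patch-based figure makes it easy to see whether a given run shows the same discrepancy.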
