Conflicting Info About the Cost of detail:low images

Hi,
In the following document, there is conflicting info about the cost of detail:low images. In one place it says 65 tokens; in another place it says 85 tokens.

https://platform.openai.com/docs/guides/vision


Yes, I don’t know where they got that “65” from. Perhaps they’re misremembering.

The minimum tokens for an image is 85. The overhead of the first message is 7 tokens.

A tiny request with a single image to ask about (but we don’t actually ask anything, just upload it) = 92 prompt tokens.
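Put another way, here’s a quick sketch of the expectation (the 85-per-image and 7-token-overhead figures above are the assumption):

# expected prompt tokens, assuming 85 tokens per detail:low image
# plus 7 tokens of overhead for the first message
def expected_prompt_tokens(n_images: int) -> int:
    return 7 + 85 * n_images

print(expected_prompt_tokens(1))  # 92

The script below checks that 92 against the live API: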

from openai import OpenAI
client = OpenAI()
# two tiny base64-encoded PNG test images
example_images = [
'iVBORw0KGgoAAAANSUhEUgAAAIAAAABACAMAAADlCI9NAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAAAAZQTFRF////MzMzOFSMkQAAAPJJREFUeNrslm0PwjAIhHv//09rYqZADzOBqMnu+WLTruOGvK0lhBBCCPHH4E7x3pwAfFE4tX9lAUBVwZyAYjwFAeikgH3XYxn88nzKbIZly4/BluUlIG66RVXBcYd9TTQWN+1vWUEqIJQI5nqYP6scl84UqUtEoLNMjoqBzFYrt+IF1FOTfGsqIIlcgAbNZ0Uoxtu6igB+tyBgZhCgAZ8KyI46zYQF/LksQC0L3gigdQBhgGkXou1hF1XebKzKXBxaDsjCOu1Q/LA1U+Joelt/9d2QVm9MjmibO2mGTEy2ZyetsbdLgAQIIYQQQoifcRNgAIfGAzQQHmwIAAAAAElFTkSuQmCC',
'iVBORw0KGgoAAAANSUhEUgAAAIAAAABACAMAAADlCI9NAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAAAAZQTFRF////AAAAVcLTfgAAAPRJREFUeNrsllEKwzAMQ+37X3owBm0c2VZCIYXpfXVBTd9qx5uZEEIIIcQr8IHjAgcc/LTBGwSiz5sEoIwTKwuxVCAW5XsxFco3Y63A3BawVWDMiFgiMD5tvELNuh/r5sA9Nu1yiYaXvBBLBawUAGubsZU5UOy8HkNvINoAv27nMVZ1WC1wfwrspPk2FDMiVpYknNu6uIxAVWQsgBoSCCQxI2KEANFdXccXseZzuKMQQDFmt6pPwU9CL+CcADEJr6qFA1aWYIgZEesGEVgmTsGvfYyIdaPYwp6JwBRL5kD4Hs7+VWGSz8aEEEIIIYQQ/8VHgAEAxPsD+SYeZ2QAAAAASUVORK5CYII='
]
# a single user message containing just the first image, sent with detail:low
user_tiled_image_message = [
  {
    "role": "user",
    "content": [
      {
        "type": "image_url",
        "image_url": {"url": f"data:image/png;base64,{example_images[0]}", "detail": "low"}
      },
    ]
  }
]

# raw-response wrapper so we can print the full JSON body, including usage
response = client.chat.completions.with_raw_response.create(
    model="gpt-4-vision-preview",
    messages=user_tiled_image_message,
    max_tokens=10, top_p=1e-19, temperature=1e-29,  # short, near-deterministic reply
)

#print(response.http_request.content.decode())   #"request" object
print(response.http_response.content.decode())  #"response" object
print(response.elapsed.total_seconds())

A response with an image description when only sending an image (no text in the message):
{"id": "chatcmpl-123456789", "object": "chat.completion", "created": 1709306528, "model": "gpt-4-1106-vision-preview", "usage": {"prompt_tokens": 92, "completion_tokens": 10, "total_tokens": 102}, "choices": [{"message": {"role": "assistant", "content": "The image displays the word \"Apple\" in a"}, "finish_reason": "length", "index": 0}]}

Another array method also has the same minimum per image.
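If you don’t want to parse the raw HTTP body, the same check can be done through the usage object on a normal create() call. A minimal sketch (reusing client and example_images from above; the 85-per-image plus 7-token-overhead figure is the expectation being tested):

# send one, then two, detail:low images and read usage.prompt_tokens
for n in (1, 2):
    content = [
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{img}", "detail": "low"},
        }
        for img in example_images[:n]
    ]
    r = client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=[{"role": "user", "content": content}],
        max_tokens=1,
    )
    print(n, "image(s):", r.usage.prompt_tokens, "prompt tokens; expected", 7 + 85 * n)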


You’re missing the flag for low-res processing.

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
  model="gpt-4-vision-preview",
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
            "detail": "high"
          },
        },
      ],
    }
  ],
  max_tokens=300,
)

print(response.choices[0].message.content)
  • low will enable the “low res” mode. The model will receive a low-res 512px x 512px version of the image, and represent the image with a budget of 65 tokens. This allows the API to return faster responses and consume fewer input tokens for use cases that do not require high detail.
  • high will enable “high res” mode, which first allows the model to see the low res image and then creates detailed crops of input images as 512px squares based on the input image size. Each of the detailed crops uses twice the token budget (65 tokens) for a total of 129 tokens.
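To actually get the “low res” pricing the first bullet describes, the image part just swaps the detail value (same URL as the snippet above; a minimal sketch):

# the same image part as above, but with the low-res flag set
low_detail_image_part = {
    "type": "image_url",
    "image_url": {
        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
        "detail": "low",  # instead of "high"
    },
}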

I’m not missing the “detail” parameter. Scroll right in the code box.
You posted the documentation that is in error, about “65 tokens”.

You can still feel free to try to get an image described for under 90 prompt tokens by whatever mechanism you think could do that, though…
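For reference, the cost-calculation part of that same page is where the 85 comes from: low detail is a flat 85 tokens, and high detail resizes the image and then charges 170 tokens per 512px tile on top of an 85-token base. A rough sketch of that math as I read it (my own helper, not anything from the API):

import math

# detail:low  -> flat 85 tokens
# detail:high -> fit within 2048x2048, scale shortest side to 768px,
#                then 170 tokens per 512px tile plus an 85-token base
def image_tokens(width: int, height: int, detail: str = "low") -> int:
    if detail == "low":
        return 85
    scale = min(1.0, 2048 / max(width, height))   # fit within 2048 x 2048
    width, height = width * scale, height * scale
    scale = 768 / min(width, height)              # shortest side to 768px
    width, height = width * scale, height * scale
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles

print(image_tokens(512, 512, "low"))     # 85 -- never 65
print(image_tokens(1024, 1024, "high"))  # 765

Either way, the floor for any image is 85 tokens, not 65.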

Ah, my bad. I totally did not scroll to the end and thought it was missing below haha. Thanks for pointing that out.
