Unexpected Token Discrepancy in GPT-4o Mini Vision Billing vs. API Usage

I recently tested GPT-4o Mini with Vision capabilities, and I noticed a significant discrepancy between the API response usage tokens and the billing tokens in my OpenAI account.

What Happened?

  • The API response reported:
"prompt_tokens": 490,
"completion_tokens": 169,
"total_tokens": 659

However, in the OpenAI billing section, it showed:

Input: 8539 tokens.
Output: 169 tokens.
Total: 8708 tokens.

The output token count is identical, but the input token count is drastically different.

What I Expected vs. What I Got

  • Based on OpenAI’s vision pricing formula, I estimated my image input would cost ~765 tokens per image, assuming high-res processing (the arithmetic is spelled out just after this list).
  • Instead, the actual billing token count was far higher than expected.
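
For reference, that estimate assumes the documented gpt-4o high-detail formula of 85 base tokens plus 170 tokens per 512×512 tile, with my image resolving to four tiles: 85 + 4 × 170 = 765 tokens.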

My Setup:

  • Model: gpt-4o-mini
  • Image Size: High-Resolution (likely triggering 765 tokens per image)
  • Number of Images: One (not multiple)
  • API Call: Single request with an image and text input

Questions:

  1. Why is the billing input token count so much higher than the API usage response?
  2. Are there hidden processing costs or extra tokens that OpenAI doesn’t report in API responses?
  3. Has anyone else experienced similar token discrepancies with GPT-4o Mini Vision?
  4. Is there a way to accurately estimate the token cost before making API calls?

I appreciate any insights from the community or OpenAI staff. Thanks! 🙌

I am having the exact same issue: 4o-mini, vision, Playground.
But in my case the difference is huge: the Playground shows approximately 800 tokens, whereas the actual consumption is 25k tokens.

Were you able to research it further? Also, the billed tokens are correct if I go by the documentation: the billing doc shows my image (920x1200) will take 25k tokens. But that count should be shown in the Playground as well.

The image token consumption of gpt-4o-mini is 33.33x that of gpt-4o, so an identical image actually costs you about twice as much in dollars.

That pricing amplification ensured images were no cheaper than on gpt-4o-2024-05-13, but then gpt-4o-2024-08-06 halved the price.
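
Roughly, using the list prices at the time ($0.15 per 1M input tokens for gpt-4o-mini, $5.00 for gpt-4o-2024-05-13, $2.50 for gpt-4o-2024-08-06): 33.33 × $0.15 ≈ $5.00 per million image-equivalent tokens, i.e. the same image cost as -2024-05-13 and about twice that of -2024-08-06.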

The pricing page calculators show you the cost, and that identical images are more expensive on -mini. They just don’t explain how that cost is applied: through a larger number of tokens.
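
To make that concrete, here is a rough estimator. It is only a sketch based on the documented detail:high tile rules and the per-image and per-tile token counts the pricing calculator appears to use (the gpt-4o figures multiplied by ~33.33 for -mini):

```python
import math

# Documented detail:high rules for gpt-4o-class models:
#   1. the image is scaled (aspect ratio preserved) to fit within 2048 x 2048,
#   2. if the shortest side is still over 768 px, it is scaled down to 768 px,
#   3. the result is covered in 512 x 512 tiles, each with a per-tile token cost,
#      plus a fixed per-image base cost.
# gpt-4o: 85 base + 170 per tile; gpt-4o-mini: 2833 base + 5667 per tile
# (the same figures multiplied by ~33.33).
IMAGE_RATES = {
    "gpt-4o": {"base": 85, "per_tile": 170},
    "gpt-4o-mini": {"base": 2833, "per_tile": 5667},
}


def high_detail_image_tokens(width: int, height: int, model: str = "gpt-4o-mini") -> int:
    """Estimate the billed tokens for a single detail:high image."""
    scale = min(1.0, 2048 / max(width, height))        # step 1: fit in 2048 x 2048
    w, h = width * scale, height * scale
    if min(w, h) > 768:                                # step 2: shortest side -> 768
        shrink = 768 / min(w, h)
        w, h = w * shrink, h * shrink
    tiles = math.ceil(w / 512) * math.ceil(h / 512)    # step 3: count 512 px tiles
    rates = IMAGE_RATES[model]
    return rates["base"] + rates["per_tile"] * tiles


# The 920x1200 image mentioned above scales to ~768x1002, i.e. 2 x 2 = 4 tiles:
print(high_detail_image_tokens(920, 1200, "gpt-4o"))       # 765
print(high_detail_image_tokens(920, 1200, "gpt-4o-mini"))  # 25501, the ~25k from billing
```

For the 920x1200 image this gives 2 × 2 = 4 tiles, i.e. 765 tokens on gpt-4o but 25,501 on gpt-4o-mini, which lines up with the ~25k billed figure.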

With detail:high, you can only truly estimate the cost of images you send yourself, not remote URLs whose resolution you have no information about. With images sent as base64, you can do some clever resizing to ensure you don’t pay 50% more for 10 extra pixels and two more high-detail tiles.
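
For what it’s worth, here is one way that pre-resizing could look with Pillow. It is just a sketch under the same tile-rule assumptions as the estimator above, it relies on the API only ever scaling images down (never up), and the slack threshold for how many wasted pixels are worth trimming is an arbitrary choice:

```python
from PIL import Image

TILE = 512


def prepare_for_detail_high(path: str, out_path: str, slack: int = 32) -> None:
    """Pre-shrink an image before base64-encoding it so that detail:high does not
    spill a handful of pixels into an extra row or column of 512 px tiles.
    A sketch only, using the same scaling assumptions as the estimator above."""
    img = Image.open(path)
    w, h = img.size

    # Mirror the API's own downscaling so the size we send is the size that gets tiled.
    scale = min(1.0, 2048 / max(w, h))
    if min(w, h) * scale > 768:
        scale = 768 / min(w, h)

    # If a scaled dimension overshoots a multiple of 512 by no more than `slack` px,
    # shrink a little further so it snaps down and drops a whole row/column of tiles.
    for dim in (w * scale, h * scale):
        overshoot = dim % TILE
        if 0 < overshoot <= slack:
            scale *= (dim - overshoot) / dim

    resized = img.resize((max(1, round(w * scale)), max(1, round(h * scale))), Image.LANCZOS)
    resized.save(out_path)
```

Run this before base64-encoding the image and borderline sizes stop tipping into an extra row or column of tiles, each of which is 5667 tokens on -mini.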