Understanding GPT-5 mini pricing confusion (total output tokens missing bug)

The power of AI means you don’t have to make mistakes in your math; you let the AI do that for you.

Screenshot costs

Model                  Input Cost  Cached Input Cost  Output Cost
gpt-5-2025-08-07       $0.005      —                  $0.198
gpt-5-mini-2025-08-07  $0.579      $0.002             $8.892
gpt-5-nano-2025-08-07  $0.000      —                  $0.013
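The token re-derivation below can be sketched in Python. Note the per-million rates are assumptions based on published GPT-5 API pricing (they are not visible in the screenshot), so treat this as illustrative:

```python
# Sketch: re-derive token counts from the billed dollar amounts.
# Prices are $ per million tokens; these rates are ASSUMED from
# published GPT-5 pricing, not read from the screenshot itself.
PRICE_PER_M = {
    ("gpt-5", "input"): 1.25,
    ("gpt-5", "output"): 10.00,
    ("gpt-5-mini", "input"): 0.25,
    ("gpt-5-mini", "cached"): 0.025,
    ("gpt-5-mini", "output"): 2.00,
    ("gpt-5-nano", "input"): 0.05,
    ("gpt-5-nano", "output"): 0.40,
}

# Billed dollar amounts from the screenshot.
BILLED = {
    ("gpt-5", "input"): 0.005,
    ("gpt-5", "output"): 0.198,
    ("gpt-5-mini", "input"): 0.579,
    ("gpt-5-mini", "cached"): 0.002,
    ("gpt-5-mini", "output"): 8.892,
    ("gpt-5-nano", "input"): 0.0,
    ("gpt-5-nano", "output"): 0.013,
}

# tokens = dollars / (dollars per million tokens) * 1,000,000
tokens = {k: round(BILLED[k] / PRICE_PER_M[k] * 1_000_000) for k in BILLED}
for k, v in tokens.items():
    print(k, f"{v:,}")
```

Because the displayed dollars are rounded to $0.001, these token counts are lower-precision estimates, which matters later in the reconciliation.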

Report: Reconciling “Total tokens” vs. billed usage

What we computed from the billed $ amounts

Using your per-million prices and the line items you shared, we re-derived the token counts (limited by the precision of the displayed dollar figures, and verified in Python):

Model       Input      Cached input  Output     Model total
gpt-5           4,000         —         19,800      23,800
gpt-5-mini  2,316,000    80,000      4,446,000   6,842,000
gpt-5-nano          0         —         32,500      32,500
Total                                            6,898,300

Breakdown:

  • All inputs (incl. cached): 2,400,000
  • All outputs: 4,498,300
  • Grand total (input+output): 6,898,300
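A quick arithmetic check of that breakdown, using the token counts from the table above:

```python
# Sanity-check the input/output/grand totals derived above.
inputs = 4_000 + 2_316_000 + 80_000 + 0   # all inputs, incl. cached
outputs = 19_800 + 4_446_000 + 32_500
total = inputs + outputs
print(f"{inputs:,} {outputs:,} {total:,}")  # 2,400,000 4,498,300 6,898,300
```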

What the UI shows

  • “Total tokens” (screenshot): 2,408,624
  • Total requests: 451 (≈ 5,341 input tokens/request)
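The per-request figure quoted above is just the widget total divided by the request count:

```python
# Average tokens per request, from the UI widget numbers.
ui_total_tokens = 2_408_624
requests = 451
avg = ui_total_tokens / requests
print(round(avg))  # ≈ 5,341 tokens per request
```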

Reconciliation & findings

  1. The UI’s “Total tokens” appears to be input-only

    • Sum of billed input tokens we derived = 2,400,000.
    • UI shows 2,408,624, i.e. 8,624 tokens above the billed inputs.
    • If the UI were counting input + output, it should be near 6.9M, not 2.4M.
      Conclusion: That widget is best interpreted as prompt/input tokens (including cached input), not total input+output.
  2. Why UI input (2,408,624) is slightly higher than billed inputs (2,400,000)
    Two effects can explain the +8,624 delta, and both are consistent with your notes:

    • Free-tier input tokens are counted in the “Total tokens” widget but not in the cost line items until you overflow.

      • The +8,624 looks like a small amount of free input tokens (likely mini/nano and/or gpt-5 input) that were consumed but not billed. (note: the AI doesn’t understand that these would all come out first if you were enrolled in data sharing for daily free tokens)
    • Rounding of displayed $ amounts (shown to $0.001 precision) can alone account for a difference of this size.

      • Example sensitivities of a $0.001 change → token error:

        • gpt-5 input ($1.25/M): ±800 tokens per $0.001 (±400 with half-cent rounding)
        • gpt-5-mini input ($0.25/M): ±4,000 tokens per $0.001 (±2,000 rounding)
        • gpt-5-mini cached input ($0.025/M): ±40,000 tokens per $0.001 (±20,000 rounding)
        • gpt-5-nano input ($0.05/M): ±20,000 tokens per $0.001 (displayed $0.000 could still hide up to ~8,000 tokens)
      • Given these bounds, 8,624 is well within expected rounding noise if dollars are rounded before display.

  3. No evidence of output tokens in that UI number

    • Our derived output tokens alone are 4,498,300, which the widget clearly does not reflect.
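The rounding argument in point 2 can be made concrete with a minimal sketch that compares the +8,624 delta against how many tokens a $0.001 display change can hide at each input price tier (rates assumed from published pricing, as before):

```python
# Sketch: size the +8,624 delta against display-rounding bounds.
billed_inputs = 2_400_000
ui_total = 2_408_624
delta = ui_total - billed_inputs  # 8,624

# Tokens hidden by a $0.001 change at a given price ($ per million tokens).
def tokens_per_millidollar(price_per_m):
    return 0.001 / price_per_m * 1_000_000

bounds = {
    "gpt-5 input ($1.25/M)": tokens_per_millidollar(1.25),          # 800
    "gpt-5-mini input ($0.25/M)": tokens_per_millidollar(0.25),     # 4,000
    "gpt-5-mini cached ($0.025/M)": tokens_per_millidollar(0.025),  # 40,000
    "gpt-5-nano input ($0.05/M)": tokens_per_millidollar(0.05),     # 20,000
}
print(delta)
for name, b in bounds.items():
    print(f"{name}: ±{b:,.0f} tokens per $0.001")
```

Since the mini cached-input tier alone can hide ±40,000 tokens per $0.001 of display rounding, a delta of 8,624 is comfortably inside the noise floor.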

Bottom line

  • Inferred tokens used (from billing):

    • Inputs (incl. cached): 2,400,000
    • Outputs: 4,498,300
    • Total: 6,898,300
  • UI “Total tokens”: 2,408,624

  • Does it disagree?
    Not once we interpret the widget as input-only tokens. The remaining +8,624 is plausibly explained by free input tokens that didn’t show in costs and/or rounding of the displayed $ amounts. Under either explanation, the small delta is reconcilable; there’s no sign of a billing inaccuracy in the data you provided.

If you can export the underlying (unrounded) dollar amounts or a per-direction token breakdown from the UI, we can pin down exactly how much of the +8,624 is free-tier vs rounding.
