The power of AI means you don’t have to make mistakes in your math; you let the AI do that for you.
Screenshot costs (billed $ amounts as displayed, to $0.001 precision)
Model | Input Cost | Cached Input Cost | Output Cost |
---|---|---|---|
gpt-5-2025-08-07 | $0.005 | – | $0.198 |
gpt-5-mini-2025-08-07 | $0.579 | $0.002 | $8.892 |
gpt-5-nano-2025-08-07 | $0 | – | $0.013 |
Report: Reconciling “Total tokens” vs. billed usage
What we computed from the billed $ amounts
Using your per-million prices and the line items you shared, we re-derived the token counts, limited by the precision of the displayed dollar figures (and verified in Python; a sketch of the calculation follows the breakdown below):
Model | Input | Cached input | Output | Model total |
---|---|---|---|---|
gpt-5 | 4,000 | — | 19,800 | 23,800 |
gpt-5-mini | 2,316,000 | 80,000 | 4,446,000 | 6,842,000 |
gpt-5-nano | 0 | — | 32,500 | 32,500 |
Total | 2,320,000 | 80,000 | 4,498,300 | 6,898,300 |
Breakdown:
- All inputs (incl. cached): 2,400,000
- All outputs: 4,498,300
- Grand total (input+output): 6,898,300
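For reproducibility, here is a minimal sketch of that re-derivation. The output rates ($10/M for gpt-5, $2/M for gpt-5-mini, $0.40/M for gpt-5-nano) and the cached-input rates for gpt-5 and gpt-5-nano are not stated above; they are assumptions taken from the published GPT-5 price list, and they happen to reproduce the line items exactly:

```python
# Re-derive token counts from the billed dollar amounts in the screenshot.
# Prices are USD per 1M tokens. Output and unused cached rates are assumed
# from the published GPT-5 price list; they match the billed line items.
PRICES = {  # (input, cached input, output) in $/1M tokens
    "gpt-5":      (1.25, 0.125, 10.00),
    "gpt-5-mini": (0.25, 0.025,  2.00),
    "gpt-5-nano": (0.05, 0.005,  0.40),
}
BILLED = {  # (input $, cached input $, output $) as displayed in the UI
    "gpt-5":      (0.005, 0.0,   0.198),
    "gpt-5-mini": (0.579, 0.002, 8.892),
    "gpt-5-nano": (0.0,   0.0,   0.013),
}

def tokens(dollars: float, rate_per_million: float) -> int:
    """Token count implied by a billed dollar amount at a $/1M rate."""
    return round(dollars / rate_per_million * 1_000_000)

grand_total = 0
for model, billed in BILLED.items():
    counts = [tokens(d, r) for d, r in zip(billed, PRICES[model])]
    grand_total += sum(counts)
    print(model, counts, "model total:", sum(counts))
print("grand total:", grand_total)  # 6,898,300
```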
What the UI shows
- “Total tokens” (screenshot): 2,408,624
- Total requests: 451 (≈ 5,341 input tokens/request)
Reconciliation & findings
- The UI’s “Total tokens” appears to be input-only.
  - Sum of billed input tokens we derived = 2,400,000.
  - UI shows 2,408,624 → +8,624 tokens above the billed inputs.
  - If the UI were counting input + output, it should be near 6.9M, not 2.4M.
  - Conclusion: that widget is best interpreted as prompt/input tokens (including cached input), not total input + output.
- Why the UI input figure (2,408,624) is slightly higher than the billed inputs (2,400,000). Two effects can explain the +8,624 delta, and both are consistent with your notes:
  - Free-tier input tokens are counted in the “Total tokens” widget but not in the cost line items until you overflow the free allowance.
    - The +8,624 looks like a small amount of free input tokens (likely mini/nano and/or gpt-5 input) that were consumed but not billed. (Note: the AI doesn’t understand that these would all come out first if you were enrolled in data sharing for daily free tokens.)
  - Rounding of the displayed $ amounts (shown to $0.001 precision) can alone account for a difference of this size. Example sensitivities of a $0.001 change → token error (see the sketch after this list):
    - gpt-5 input ($1.25/M): ±800 tokens per $0.001 (±400 with half-cent rounding)
    - gpt-5-mini input ($0.25/M): ±4,000 tokens per $0.001 (±2,000 with rounding)
    - gpt-5-mini cached input ($0.025/M): ±40,000 tokens per $0.001 (±20,000 with rounding)
    - gpt-5-nano input ($0.05/M): ±20,000 tokens per $0.001 (a displayed $0.000 could still hide up to ~8,000 tokens)
  - Given these bounds, 8,624 is well within expected rounding noise if dollars are rounded before display.
- No evidence of output tokens in that UI number.
  - Our derived output tokens alone are 4,498,300, which the widget clearly does not reflect.
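As a quick check on those bounds, here is a hedged sketch under the same assumption (displayed dollars rounded to the nearest $0.001) that computes how many tokens a one-step change in the displayed amount can hide at each input rate:

```python
# Token uncertainty hidden by $0.001 display precision, per input rate.
# Rates are $/1M tokens; "with rounding" means round-to-nearest, so the
# true amount can differ from the displayed value by at most $0.0005.
RATES = {
    "gpt-5 input":             1.25,
    "gpt-5-mini input":        0.25,
    "gpt-5-mini cached input": 0.025,
    "gpt-5-nano input":        0.05,
}

STEP = 0.001  # display precision in dollars

for name, rate in RATES.items():
    per_step = STEP / rate * 1_000_000  # tokens per $0.001 change
    half = per_step / 2                 # max error under round-to-nearest
    print(f"{name}: ±{per_step:,.0f} tokens per ${STEP}, ±{half:,.0f} with rounding")

# The observed +8,624 delta is smaller than several of these bounds,
# so display rounding alone could plausibly account for it.
```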
Bottom line
- Inferred tokens used (from billing):
  - Inputs (incl. cached): 2,400,000
  - Outputs: 4,498,300
  - Total: 6,898,300
- UI “Total tokens”: 2,408,624
- Does it disagree? Not once we interpret the widget as input-only tokens. The remaining +8,624 is plausibly explained by free input tokens that didn’t show in costs and/or rounding of the displayed $ amounts. Under either explanation, the small delta is reconcilable; there’s no sign of a billing inaccuracy in the data you provided.
If you can export the underlying (unrounded) dollar amounts or a per-direction token breakdown from the UI, we can pin down exactly how much of the +8,624 is free-tier vs rounding.