Understanding GPT-5 mini pricing confusion (total output tokens missing bug)


Maybe my math is off, but mini pricing with the API is 2 dollars per 1M tokens. I have 2.5 million tokens, but it charged me almost 10 dollars. Can someone explain this to me, or is this a bug?

Side note: wtf is wrong with this forum? I can't put dollar signs in my text; it comes out weird.


Hey, @Zoinks, welcome to the forum!

Input and output tokens are billed separately, at different prices.

Price:

Input: $1.250 / 1M tokens
Cached input: $0.125 / 1M tokens
Output: $10.000 / 1M tokens
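
For example, with hypothetical token counts at those rates:

```python
# Hypothetical token counts; input and output are billed separately.
input_tokens, output_tokens = 500_000, 200_000
cost = input_tokens / 1e6 * 1.250 + output_tokens / 1e6 * 10.000
print(f"${cost:.3f}")  # $0.625 input + $2.000 output = $2.625
```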

Were you using reasoning?


The prices you posted are not for MINI; I am using mini.

No, I just simply call the API and gave it the gpt-5-mini model, as shown in the screenshot. You can see it's the mini model for most of the amount spent, and those numbers don't align with the official pricing.

Now I wonder how much they have been overcharging me in total on other models.

Ah, sorry about that.

Are you logging the tokens when you make the calls? That might help explain it more.

Just using gpt-5-mini, I believe reasoning is on by default (set to medium), which might be where your extra tokens are coming from?


I am not logging tokens. The number of tokens is fine, 2.4M; it's the pricing. Isn't output just 2.4M × $2.00 per 1M? I thought it was that straightforward.
As far as I know, I am not using reasoning of any kind, just straight up calling the API with that model.

I was using regular 5 before mini, and it seems to be charging the same price for mini, so I'm not sure what's going on. The screenshot should have all the input and output token numbers plus the charges for the different models, and that's where it doesn't make sense, unless I am mistaken and output cost isn't simply token count × output price.

The power of AI means you don't have to make mistakes in your math; you let the AI do that for you.

Screenshot costs

Model                   Input Cost   Cached Input Cost   Output Cost
gpt-5-2025-08-07        $0.005       —                   $0.198
gpt-5-mini-2025-08-07   $0.579       $0.002              $8.892
gpt-5-nano-2025-08-07   $0           —                   $0.013

Report: Reconciling “Total tokens” vs. billed usage

What we computed from the billed $ amounts

Using your per-million prices and the line items you shared, we re-derived the token counts, limited by the accuracy of the dollar figures (and verified in Python; a sketch of that check follows the breakdown below):

Model        Input       Cached input   Output      Model total
gpt-5        4,000       —              19,800      23,800
gpt-5-mini   2,316,000   80,000         4,446,000   6,842,000
gpt-5-nano   0           —              32,500      32,500
Total                                               6,898,300

Breakdown:

  • All inputs (incl. cached): 2,400,000
  • All outputs: 4,498,300
  • Grand total (input+output): 6,898,300
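
A minimal sketch of that check (the gpt-5 and mini rates are the ones quoted in this thread; the nano rates are assumed from the published price list):

```python
# Re-derive token counts from the billed dollar line items by dividing
# each amount by its per-token price.
# Rates are $ per 1M tokens: (input, cached input, output).
PRICES = {
    "gpt-5":      (1.25, 0.125, 10.00),
    "gpt-5-mini": (0.25, 0.025,  2.00),
    "gpt-5-nano": (0.05, 0.005,  0.40),  # assumed from the published price list
}
# Dollar line items from the screenshot: (input, cached input, output).
BILLED = {
    "gpt-5":      (0.005, 0.0,   0.198),
    "gpt-5-mini": (0.579, 0.002, 8.892),
    "gpt-5-nano": (0.0,   0.0,   0.013),
}

grand_total = 0
for model, dollars in BILLED.items():
    tokens = [d / p * 1_000_000 for d, p in zip(dollars, PRICES[model])]
    grand_total += sum(tokens)
    print(model, [f"{t:,.0f}" for t in tokens])
print(f"grand total: {grand_total:,.0f}")  # -> 6,898,300
```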

What the UI shows

  • “Total tokens” (screenshot): 2,408,624
  • Total requests: 451 (≈ 5,341 input tokens/request)

Reconciliation & findings

  1. The UI’s “Total tokens” appears to be input-only

    • Sum of billed input tokens we derived = 2,400,000.
    • UI shows 2,408,624, i.e. +8,624 tokens above the billed inputs.
    • If the UI were counting input + output, it should be near 6.9M, not 2.4M.
      Conclusion: That widget is best interpreted as prompt/input tokens (including cached input), not total input+output.
  2. Why UI input (2,408,624) is slightly higher than billed inputs (2,400,000)
    Two effects can explain the +8,624 delta, and both are consistent with your notes:

    • Free-tier input tokens are counted in the “Total tokens” widget but not in the cost line items until you overflow.

      • The +8,624 looks like a small amount of free input tokens (likely mini/nano and/or GPT input) that were consumed but not billed. (note: the AI doesn’t understand that these would all come out first if you were enrolled in data sharing for daily free tokens)
    • Rounding of displayed $ amounts (shown to $0.001 precision) can alone account for a difference of this size.

      • Example sensitivities of a $0.001 change → token error:

        • gpt-5 input ($1.25/M): ±800 tokens per $0.001 (±400 with round-to-nearest)
        • gpt-5-mini input ($0.25/M): ±4,000 tokens per $0.001 (±2,000 with round-to-nearest)
        • gpt-5-mini cached input ($0.025/M): ±40,000 tokens per $0.001 (±20,000 with round-to-nearest)
        • gpt-5-nano input ($0.05/M): ±20,000 tokens per $0.001 (a displayed $0.000 could still hide up to ~10,000 tokens)
      • Given these bounds, 8,624 is well within expected rounding noise if dollars are rounded before display. (A short sketch reproducing these sensitivities appears after this list.)

  3. No evidence of output tokens in that UI number

    • Our derived output tokens alone are 4,498,300, which the widget clearly does not reflect.
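
The sensitivity figures in point 2 can be reproduced with a few lines:

```python
# Tokens represented by a ±$0.001 change in a displayed line item,
# at each rate ($ per 1M tokens).
rates = {
    "gpt-5 input": 1.25,
    "gpt-5-mini input": 0.25,
    "gpt-5-mini cached input": 0.025,
    "gpt-5-nano input": 0.05,
}
for label, per_million in rates.items():
    print(f"{label}: ±{0.001 / per_million * 1_000_000:,.0f} tokens per $0.001")
```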

Bottom line

  • Inferred tokens used (from billing):

    • Inputs (incl. cached): 2,400,000
    • Outputs: 4,498,300
    • Total: 6,898,300
  • UI “Total tokens”: 2,408,624

  • Does it disagree?
    Not once we interpret the widget as input-only tokens. The remaining +8,624 is plausibly explained by free input tokens that didn’t show in costs and/or rounding of the displayed $ amounts. Under either explanation, the small delta is reconcilable; there’s no sign of a billing inaccuracy in the data you provided.

If you can export the underlying (unrounded) dollar amounts or a per-direction token breakdown from the UI, we can pin down exactly how much of the +8,624 is free-tier vs rounding.


Looks like the mistake was completely mine.
If I hover over the total tokens, it literally tells me the input tokens at 2.409M and the output at 4.497M, and with those numbers the charge seems to be correct, or at least makes more sense than the initial 2.4M total tokens. I read it as the total tokens for output and input combined; that label seems misleading, or at least I think so.
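
Plugging those hover numbers into the mini rates quoted earlier (cached input ignored) roughly reproduces the charge:

```python
# Hover figures from the usage page, at the gpt-5-mini per-1M rates.
cost = 2.409e6 / 1e6 * 0.25 + 4.497e6 / 1e6 * 2.00
print(f"${cost:.2f}")  # ~$9.60, in line with the ~$10 charge
```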


No worries. It happens to the best and worst of us at one point or another. Haha.

I hope you stick around. We’ve got a great community growing, and a wealth of knowledge.


However, this report IS valid: “Total Tokens” is incorrect.

Just checked my usage.

The “Total Tokens” figure is only summing the selected period’s input tokens. There is no way to get it to actually include or compute the output tokens over the date range. In the bar graph below it, the output tokens can only be viewed day by day by hovering, never totaled.

The UI needs a bugfix.

There is a way: click “chat completions” below on the usage page (or in this link):

Then, choose the date range and composition as you wish:

There will be a total for the whole period, above the chart.

(although the “total tokens” terminology on the usage page remains wrong)

Over at my place, I have a welcome mat inviting all in for safe passage.
Go into the bathroom, open the medicine cabinet, and read the post-it note on the back of the mirror that actually says you are in danger…

The information presented right up front is false with no way to rectify it.