Understanding GPT-5 mini pricing confusion (total output tokens missing bug)


Maybe my math is off, but mini pricing with the API is 2 dollars per 1M tokens. I have 2.5 million tokens, but it charged me almost 10 dollars. Can someone explain this to me, or is this a bug?

Side note: wtf is wrong with this forum? I can't put dollar signs in my text; it comes out weird.


Hey, @Zoinks, welcome to the forum!

Input and output tokens are billed separately, at different prices.

Price:

Input: $1.250 / 1M tokens
Cached input: $0.125 / 1M tokens
Output: $10.000 / 1M tokens
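
For example, with hypothetical token counts at those rates:

```python
# Hypothetical token counts; input and output are billed separately.
input_tokens, output_tokens = 500_000, 200_000
cost = input_tokens / 1e6 * 1.250 + output_tokens / 1e6 * 10.000
print(f"${cost:.3f}")  # $0.625 input + $2.000 output = $2.625
```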

Were you using reasoning?


The prices you posted are not for MINI; I am using mini.

No, I just simply call the API and gave it the gpt-5-mini model, as shown in the screenshot. You can see it's the mini model for most of the amount spent, and those numbers don't align with the official pricing.

Now I wonder how much they have been overcharging me in total on other models.

Ah, sorry about that.

Are you logging the tokens when you make the calls? That might help explain it more.

Just using gpt-5-mini, I believe reasoning is on by default (set to medium), which might be where your extra tokens are coming from?


I am not logging tokens. The number of tokens is fine, 2.4M; it's the pricing. Isn't output just 2.4M × $2.00 per 1M? I thought it was that straightforward.
As far as I know, I am not using reasoning of any kind, just straight up calling the API with that model.

I was using regular 5 before mini, and it seems to be charging the same price for mini, so I'm not sure what's going on. The screenshot should have all the input and output token numbers plus the charges for the different models, and that's where it doesn't make sense, unless I am mistaken and output cost isn't simply token count × output price.

The power of AI means you don't have to make mistakes in your math; you let the AI do that for you.

Screenshot costs

Model                   Input Cost   Cached Input Cost   Output Cost
gpt-5-2025-08-07        $0.005       —                   $0.198
gpt-5-mini-2025-08-07   $0.579       $0.002              $8.892
gpt-5-nano-2025-08-07   $0           —                   $0.013

Report: Reconciling “Total tokens” vs. billed usage

What we computed from the billed $ amounts

Using your per-million prices and the line items you shared, we re-derived the token counts, limited by the accuracy of the dollar figures (and verified in Python; a sketch of that check follows the breakdown below):

Model        Input       Cached input   Output      Model total
gpt-5        4,000       —              19,800      23,800
gpt-5-mini   2,316,000   80,000         4,446,000   6,842,000
gpt-5-nano   0           —              32,500      32,500
Total                                               6,898,300

Breakdown:

  • All inputs (incl. cached): 2,400,000
  • All outputs: 4,498,300
  • Grand total (input+output): 6,898,300
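
A minimal sketch of that check (the gpt-5 and mini rates are the ones quoted in this thread; the nano rates are assumed from the published price list):

```python
# Re-derive token counts from the billed dollar line items by dividing
# each amount by its per-token price.
# Rates are $ per 1M tokens: (input, cached input, output).
PRICES = {
    "gpt-5":      (1.25, 0.125, 10.00),
    "gpt-5-mini": (0.25, 0.025,  2.00),
    "gpt-5-nano": (0.05, 0.005,  0.40),  # assumed from the published price list
}
# Dollar line items from the screenshot: (input, cached input, output).
BILLED = {
    "gpt-5":      (0.005, 0.0,   0.198),
    "gpt-5-mini": (0.579, 0.002, 8.892),
    "gpt-5-nano": (0.0,   0.0,   0.013),
}

grand_total = 0
for model, dollars in BILLED.items():
    tokens = [d / p * 1_000_000 for d, p in zip(dollars, PRICES[model])]
    grand_total += sum(tokens)
    print(model, [f"{t:,.0f}" for t in tokens])
print(f"grand total: {grand_total:,.0f}")  # -> 6,898,300
```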

What the UI shows

  • “Total tokens” (screenshot): 2,408,624
  • Total requests: 451 (≈ 5,341 input tokens/request)

Reconciliation & findings

  1. The UI’s “Total tokens” appears to be input-only

    • Sum of billed input tokens we derived = 2,400,000.
    • UI shows 2,408,624, i.e. +8,624 tokens above the billed inputs.
    • If the UI were counting input + output, it should be near 6.9M, not 2.4M.
      Conclusion: That widget is best interpreted as prompt/input tokens (including cached input), not total input+output.
  2. Why UI input (2,408,624) is slightly higher than billed inputs (2,400,000)
    Two effects can explain the +8,624 delta, and both are consistent with your notes:

    • Free-tier input tokens are counted in the “Total tokens” widget but not in the cost line items until you overflow.

      • The +8,624 looks like a small amount of free input tokens (likely mini/nano and/or GPT input) that were consumed but not billed. (note: the AI doesn’t understand that these would all come out first if you were enrolled in data sharing for daily free tokens)
    • Rounding of displayed $ amounts (shown to $0.001 precision) can alone account for a difference of this size.

      • Example sensitivities of a $0.001 change → token error:

        • gpt-5 input ($1.25/M): ±800 tokens per $0.001 (±400 with round-to-nearest)
        • gpt-5-mini input ($0.25/M): ±4,000 tokens per $0.001 (±2,000 with round-to-nearest)
        • gpt-5-mini cached input ($0.025/M): ±40,000 tokens per $0.001 (±20,000 with round-to-nearest)
        • gpt-5-nano input ($0.05/M): ±20,000 tokens per $0.001 (a displayed $0.000 could still hide up to ~10,000 tokens)
      • Given these bounds, 8,624 is well within expected rounding noise if dollars are rounded before display. (A short sketch reproducing these sensitivities appears after this list.)

  3. No evidence of output tokens in that UI number

    • Our derived output tokens alone are 4,498,300, which the widget clearly does not reflect.
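
The sensitivity figures in point 2 can be reproduced with a few lines:

```python
# Tokens represented by a ±$0.001 change in a displayed line item,
# at each rate ($ per 1M tokens).
rates = {
    "gpt-5 input": 1.25,
    "gpt-5-mini input": 0.25,
    "gpt-5-mini cached input": 0.025,
    "gpt-5-nano input": 0.05,
}
for label, per_million in rates.items():
    print(f"{label}: ±{0.001 / per_million * 1_000_000:,.0f} tokens per $0.001")
```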

Bottom line

  • Inferred tokens used (from billing):

    • Inputs (incl. cached): 2,400,000
    • Outputs: 4,498,300
    • Total: 6,898,300
  • UI “Total tokens”: 2,408,624

  • Does it disagree?
    Not once we interpret the widget as input-only tokens. The remaining +8,624 is plausibly explained by free input tokens that didn’t show in costs and/or rounding of the displayed $ amounts. Under either explanation, the small delta is reconcilable; there’s no sign of a billing inaccuracy in the data you provided.

If you can export the underlying (unrounded) dollar amounts or a per-direction token breakdown from the UI, we can pin down exactly how much of the +8,624 is free-tier vs rounding.


Looks like the mistake was completely mine.
If I hover over the total tokens, it literally tells me the input tokens at 2.409M and the output at 4.497M, and with those numbers the charge seems to be correct, or at least makes more sense than the initial 2.4M total tokens. I read it as the total tokens for output and input combined; that label seems misleading, or at least I think so.
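
Plugging those hover numbers into the mini rates quoted earlier (cached input ignored) roughly reproduces the charge:

```python
# Hover figures from the usage page, at the gpt-5-mini per-1M rates.
cost = 2.409e6 / 1e6 * 0.25 + 4.497e6 / 1e6 * 2.00
print(f"${cost:.2f}")  # ~$9.60, in line with the ~$10 charge
```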


No worries. It happens to the best and worst of us at one point or another. Haha.

I hope you stick around. We’ve got a great community growing, and a wealth of knowledge.


However, this report IS valid: “Total Tokens” is incorrect.

Just checked my usage.

The “Total Tokens” figure is only summing the selected period’s input tokens. There is no way to get it to actually include or compute the output tokens over the date range. In the bar graph below it, the output tokens can only be viewed day by day by hovering, never totaled.

The UI needs a bugfix.

There is a way: click “chat completions” below on the usage page (or in this link):

Then, choose the date range and composition as you wish:

There will be a total for the whole period, above the chart.

(although the “total tokens” terminology on the usage page remains wrong)

Over at my place, I have a welcome mat inviting all in for safe passage.
Go into the bathroom, open the medicine cabinet, and read the post-it note on the back of the mirror that actually says you are in danger…

The information presented right up front is false with no way to rectify it.