I’m relying on prompt caching heavily (according to the API response, 90% of my tokens are cached).
When I log the usage from the API responses, total prompt tokens come to around 50k and cached tokens to around 49k (and I run many, many prompts like that, of course).
However, when I look at the OpenAI Usage dashboard, I see “uncached input pay as you go” at, say, 10 million tokens (for gpt-4o-mini-2024-07-18), while cached input is only at around 500k.
Such a low cached-input figure makes no sense given the usage and cached-usage numbers in my API responses; if anything, the dashboard looks inverted.
So I'm just asking: how does the dashboard calculate usage, and how do I reconcile it with what I see in the API responses on my end?
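For reference, this is roughly how I'm reading those numbers, a minimal sketch with the Python SDK (the shared system prompt is a placeholder; caching only applies to prompts of 1024+ tokens):

```python
from openai import OpenAI

client = OpenAI()

# Placeholder for the large shared prefix that should be cached; only the
# part that is identical across requests gets cache hits.
long_shared_prefix = "...your big system prompt / context here..."

response = client.chat.completions.create(
    model="gpt-4o-mini-2024-07-18",
    messages=[
        {"role": "system", "content": long_shared_prefix},
        {"role": "user", "content": "the part that changes per request"},
    ],
)

usage = response.usage
details = usage.prompt_tokens_details
cached = (details.cached_tokens or 0) if details else 0
print(
    f"prompt tokens: {usage.prompt_tokens}, "
    f"cached tokens: {cached}, "
    f"hit rate: {cached / usage.prompt_tokens:.0%}"
)
```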
Prompt caching currently seems to be borked; there are a bunch of threads on this.
OpenAI status (https://status.openai.com/) seems to think everything is working as normal, and I don’t see any OAI employee activity in these threads unless I’ve missed something.
Soo…
Sorry, there's not much anyone can do other than wait (or migrate to Azure), I guess.
I think this might also be related to how the usage response from completion calls accounts for image tokens. In my experience, image tokens are not included in the usage response, so if you are using images and those are not getting cache hits, it would make sense that the API response shows a high cache-hit rate while the dashboard says otherwise.
Check out the organization.usage endpoint! Might get you some better data there.
Polling the Usage API completions object for a single day (https://api.openai.com/v1/organization/usage/completions), we see zero for input_cached_tokens, yet the chat completion objects returned that day do show cached tokens (saved when streaming, or returned with the response).
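For anyone who wants to compare their own numbers, a minimal sketch of that poll might look like this (it assumes an admin API key in OPENAI_ADMIN_KEY and the documented start_time / bucket_width query parameters):

```python
import os
import time

import requests

# The Usage API requires an organization admin key, not a regular project key.
ADMIN_KEY = os.environ["OPENAI_ADMIN_KEY"]

# One daily bucket covering roughly the last 24 hours.
params = {
    "start_time": int(time.time()) - 24 * 3600,  # Unix timestamp
    "bucket_width": "1d",
}

resp = requests.get(
    "https://api.openai.com/v1/organization/usage/completions",
    headers={"Authorization": f"Bearer {ADMIN_KEY}"},
    params=params,
    timeout=30,
)
resp.raise_for_status()

# Each bucket holds aggregated results, including input_cached_tokens.
for bucket in resp.json().get("data", []):
    for result in bucket.get("results", []):
        print(
            "input_tokens:", result.get("input_tokens"),
            "input_cached_tokens:", result.get("input_cached_tokens"),
            "output_tokens:", result.get("output_tokens"),
        )
```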
For reference: I just checked the numbers for one day, yesterday, Jan. 6 (a full 24 hours), against the usage shown on the dashboard:
For gpt-4o-2024-11-20, running predominantly on the Assistants API:
input usage cost is 1.7x what it should be, with no cached input recorded;
output usage is close to 1x.
I do note that o1-2024-12-17 has cached input recorded.
The numbers listed on the Usage dashboard are accurate only if no cached input is taken into account for gpt-4o-2024-11-20.
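For what it's worth, that 1.7x is about what you'd expect if cached input were billed at the full uncached price. A rough sanity check (the ~80% cache-hit rate here is my assumption, not pulled from the dashboard; the 50% cached-input discount is from the published pricing):

```python
# Rough sanity check: what if cached input were billed as if uncached?
hit_rate = 0.80        # assumed fraction of input tokens served from the cache
cache_discount = 0.50  # cached input tokens are billed at half the uncached rate

# Expected blended input cost relative to an all-uncached bill.
expected = hit_rate * cache_discount + (1 - hit_rate)  # 0.6
overcharge = 1 / expected                              # ~1.67x

print(f"ignoring the cache discount inflates input cost by ~{overcharge:.2f}x")
```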
I'd be OK with it if it were just a reporting/dashboard issue, but I was charged for December the same amount as reported in the dashboard for that month.
This has been brought up with staff, and they are looking into it.
Thanks for confirming that the credits deducted are not aligned with a proper cache hit!
I just came to the forum to report the exact same issue. My responses report a VERY good cache-hit rate (around 93%), but in my billing dashboard almost nothing is going to the caching category. Not none, but something like 3 cents.
Good to hear I’m not crazy. Been doubting my math for the past 30 minutes.
(Would be really great to get that cost reimbursed! I’m racking up quite the bill right now, lol.)
I think the issue has now been rectified: the input tokens from the chat completion object now match the dashboard numbers. Thanks for solving the issue! It would be great if the extra credits charged in previous months could be restored.
Here is a post from another user who received a response from support stating that OpenAI is investigating the issue of overcharging:
I will now close this topic. Please continue the conversation in the thread linked above. This will help maintain a clear overview of how the situation is progressing.