Dashboard usage vs Prompt response usage not matching

I’m relying heavily on prompt caching (according to the API responses, about 90% of my tokens are cached).
When I log the usage from the API responses, total tokens come to around 50k, with cached tokens at around 49k (and I run many, many prompts like that, of course).

However, when I look at the OpenAI usage dashboard, I see “uncached input, pay as you go” at, say, 10 million tokens (for gpt-4o-mini-2024-07-18), while cached input is only around 500k.

Cached input being that low makes no sense given the usage and cached-usage numbers in my API responses; if anything, the dashboard looks inverted.

So I’m just asking: how does the dashboard calculate usage, and how do I reconcile it with what I can see on my end from the API responses?
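
For context, here’s roughly how I’m pulling those numbers from each response (the prompt is a placeholder; the cached count shows up under `usage.prompt_tokens_details`):

```python
# Sketch of how I log per-response usage (placeholder prompt and model).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini-2024-07-18",
    messages=[{"role": "user", "content": "..."}],  # placeholder prompt
)

usage = response.usage
cached = usage.prompt_tokens_details.cached_tokens  # cached portion of the prompt
print(f"prompt tokens: {usage.prompt_tokens}, cached: {cached}")
print(f"cache hit rate: {cached / usage.prompt_tokens:.0%}")
```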

1 Like

Welcome to the community!

Prompt caching currently seems to be borked; there are a bunch of threads about this.

OpenAI status (https://status.openai.com/) seems to think everything is working as normal, and I don’t see any OAI employee activity in these threads unless I’ve missed something.

Soo…

Sorry :grimacing: there’s not much anyone can do other than wait (or migrate to Azure) I guess :confused:

2 Likes

Have there been any updates or word from OpenAI support about this? ASI is cool and all, but until then can we get the easy stuff fixed first?

Just got billed for December and was almost certainly overcharged.

1 Like

I don’t think so. I am also experiencing a similar issue, and I also feel that I have been overcharged.

I think this might also be related to how the usage response from completion calls accounts for image tokens. In my experience, image tokens are not included in the response, so if you are using images and those are not getting cache hits, it would make sense that the API response reports a high cache hit rate while the dashboard says otherwise.

Check out the organization.usage endpoint! Might get you some better data there.

Yeah, no; the Usage API’s reports are off too:

Polling the Usage API completions object for a single day (https://api.openai.com/v1/organization/usage/completions), we see zero for input_cached_tokens, yet the completion objects returned for that day do show cached tokens (saved when streamed, or returned with …).

Clarity on Cache Requested
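
For anyone who wants to check their own org, a minimal sketch of that poll (the endpoint and the input_cached_tokens field come from the Usage API; the admin-key env var name is just my own choice):

```python
# Minimal sketch of polling the Usage API for the last 24 hours of
# completions data. Requires an org admin key; OPENAI_ADMIN_KEY is
# simply the env var name assumed here.
import os
import time

import requests

resp = requests.get(
    "https://api.openai.com/v1/organization/usage/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_ADMIN_KEY']}"},
    params={
        "start_time": int(time.time()) - 86_400,  # Unix seconds, last 24h
        "bucket_width": "1d",
    },
    timeout=30,
)
resp.raise_for_status()

for bucket in resp.json()["data"]:
    for result in bucket["results"]:
        print(
            "input:", result["input_tokens"],
            "cached:", result["input_cached_tokens"],
            "output:", result["output_tokens"],
        )
```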

2 Likes

Maybe we can summon @vb with a ping here, just to get any sort of ball rolling.

Thanks so much, appreciate it.

For reference: I just checked the numbers for one day (yesterday, Jan. 6, a full 24 hours) against the usage shown on the dashboard:

For gpt-4o-2024-11-20 running predominantly on Assistants API:

  • input usage cost is 1.7x what it should be, with no cached input recorded
  • output usage is close to 1x

I do note that o1-2024-12-17 has cached input recorded.

The numbers listed on the usage dashboard line up exactly with what you’d expect if no cached input were being counted for gpt-4o-2024-11-20.
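
That 1.7x is roughly what you’d expect if the cache discount were simply dropped: cached input for gpt-4o is billed at half the uncached rate, so with (say) an assumed ~85% cache hit rate, losing the discount inflates the bill by about that factor:

```python
# Back-of-envelope check. Assumes gpt-4o cached input is billed at 50% of
# the uncached rate, and an assumed ~85% cache hit rate on input tokens.
hit_rate = 0.85       # assumed fraction of input tokens served from cache
cached_price = 0.5    # cached tokens cost half the uncached rate

# Expected input cost as a fraction of the full uncached price:
expected = (1 - hit_rate) + hit_rate * cached_price  # 0.575

# If everything is billed as uncached instead, the overcharge factor is:
print(round(1 / expected, 2))  # 1.74 -> close to the observed ~1.7x
```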

I’d be OK with it if it were just a reporting/dashboard issue, but I was charged the same amount for December as the dashboard reported for that month.

1 Like

This has been brought up with staff, and they are looking into it.
Thanks for confirming that the credits deducted don’t align with a proper cache hit!

2 Likes

I just came to the forum to report the exact same issue. My responses are reporting a VERY good cache hit rate (around 93%), but in my billing dashboard almost nothing is going to the caching category. Not zero, but something like 3 cents.

Good to hear I’m not crazy. Been doubting my math for the past 30 minutes.

(Would be really great to get that cost reimbursed! I’m racking up quite the bill right now, lol.)

1 Like

LOL. Looks like they’ve got it working again …

Now to just backfill this month, December, and November (I’ll give you October :slight_smile: )

2 Likes

I think the issue has now been rectified: the input tokens from the chat completion objects match the dashboard numbers. Thanks for solving the issue! It would be great if the extra credits charged in previous months could be restored!
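
If anyone else wants to double-check their own bill, one rough approach is to sum the per-request usage you log locally and compare it against the dashboard’s totals. A sketch, assuming you append each response’s usage to a JSONL file (the log format and file name here are made up):

```python
# Sums locally logged usage for comparison with the dashboard's totals.
# Assumes one JSON object per line with "prompt_tokens", "cached_tokens",
# and "completion_tokens" keys (a hypothetical log format).
import json

totals = {"prompt_tokens": 0, "cached_tokens": 0, "completion_tokens": 0}

with open("usage_log.jsonl") as f:
    for line in f:
        entry = json.loads(line)
        for key in totals:
            totals[key] += entry.get(key, 0)

uncached = totals["prompt_tokens"] - totals["cached_tokens"]
print(f"uncached input: {uncached}")
print(f"cached input:   {totals['cached_tokens']}")
print(f"output:         {totals['completion_tokens']}")
```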

2 Likes

Here is a post from another user who received a response from support stating that OpenAI is investigating the overcharging issue:

I will now close this topic. Please continue the conversation in the thread linked above. This will help maintain a clear overview of how the situation is progressing.

1 Like