4o input not being cached

rick.getz · December 31, 2024, 1:54am

Similar issue here. We track our tokens from each request internally. While we are no longer seeing cached tokens, we are also seeing an increase in token usage in our dashboard that is not correct. We’ve notified OpenAI support, but I recommend you log your own usage as well. We are being charged nearly double the price.
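
For anyone who wants to log usage on their side as suggested above, here is a minimal sketch, not our production code. It assumes each API response exposes a `usage` object that you have converted to a plain dict, with `prompt_tokens`, `completion_tokens`, and a nested `prompt_tokens_details.cached_tokens`. The helper names `log_usage` and `read_totals` and the JSON Lines file layout are hypothetical choices made for illustration.

```python
import json
import time
from pathlib import Path


def log_usage(log_path, request_id, usage):
    """Append one request's token counts to a JSON Lines file.

    `usage` is assumed to be the response's usage object as a plain dict.
    """
    details = usage.get("prompt_tokens_details") or {}
    record = {
        "ts": time.time(),
        "request_id": request_id,
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "cached_tokens": details.get("cached_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
    }
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def read_totals(log_path):
    """Sum the logged token counts so they can be compared with the dashboard."""
    totals = {"prompt_tokens": 0, "cached_tokens": 0, "completion_tokens": 0}
    with Path(log_path).open(encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            record = json.loads(line)
            for key in totals:
                totals[key] += record.get(key, 0)
    return totals
```

Calling `log_usage(...)` after every request and `read_totals(...)` at the end of the day gives you numbers to hold against what the dashboard reports.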

b.silva · December 31, 2024, 2:37pm

I just received this response.

nimdraug.sael · January 1, 2025, 8:14am

Similar issue here; it started around 18 Dec.
We also monitor caching by analyzing the usage field in completion responses. Our current metrics show an approximately 80% cache hit ratio, while the OpenAI dashboard shows <1%.

Haven’t gotten any response from OpenAI support yet.
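
The cache hit ratio mentioned above can be computed roughly like this. It is a sketch under assumptions: each `usage` is a plain dict taken from a completion response, cached tokens are reported under `prompt_tokens_details.cached_tokens`, and the ratio is defined as cached prompt tokens divided by all prompt tokens. The function name `cache_hit_ratio` is made up for this example.

```python
def cache_hit_ratio(usages):
    """Return cached prompt tokens divided by total prompt tokens.

    `usages` is an iterable of usage dicts, one per completion response.
    Responses without cache details count as zero cached tokens.
    """
    prompt = 0
    cached = 0
    for usage in usages:
        details = usage.get("prompt_tokens_details") or {}
        prompt += usage.get("prompt_tokens", 0)
        cached += details.get("cached_tokens", 0)
    # Avoid dividing by zero when nothing has been recorded yet.
    return cached / prompt if prompt else 0.0
```

If this number sits far from what the dashboard reports for the same period, that is the discrepancy described in this thread.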

zoss · January 3, 2025, 12:57am

Exactly the same here.

On what exact date did the problem start?

For me it was the 17th … 19th, and it remains the same today (January 2nd).

b.silva · January 3, 2025, 2:31pm

They keep insisting that the problem is with our code. I created a program to list the consumption of all threads, and all of them contain cached tokens (about 70% should be cached), yet there are 0 cached tokens on the dashboard. I’ve had a sizable increase in API costs since then, and I believe I’m being overcharged because of this.
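
To put a rough number on the extra charge, something like the following could be used. It is a back-of-the-envelope sketch, not the program mentioned above. The per-million price and the cached-token discount are parameters you must fill in from the current pricing page; the values in the usage note are placeholders, and the function names are hypothetical.

```python
def input_cost(prompt_tokens, cached_tokens, price_per_million, cached_discount):
    """Cost of input tokens when `cached_tokens` of them get a discount.

    `cached_discount` is a fraction, e.g. 0.5 for half price (an assumption).
    """
    uncached = prompt_tokens - cached_tokens
    cached_price = price_per_million * (1 - cached_discount)
    return (uncached * price_per_million + cached_tokens * cached_price) / 1_000_000


def overcharge_if_cache_ignored(prompt_tokens, cached_tokens,
                                price_per_million, cached_discount):
    """Difference between billing every input token at full price
    and billing the cached ones at the discounted price."""
    billed = input_cost(prompt_tokens, 0, price_per_million, cached_discount)
    expected = input_cost(prompt_tokens, cached_tokens,
                          price_per_million, cached_discount)
    return billed - expected
```

For example, `overcharge_if_cache_ignored(1_000_000, 700_000, 2.50, 0.5)` estimates the extra cost for one million input tokens with 70% cached, using a placeholder price of 2.50 per million and a placeholder 50% discount.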

dsb · January 3, 2025, 3:28pm

Is OpenAI ever going to fix this? This is the same input on another platform.

dsb · January 3, 2025, 3:29pm

And this is that same input with OpenAI (not as many times, since caching isn’t working). I have no idea why the tiny sliver of caching occurs.

jim · January 3, 2025, 6:38pm

At least you’re still getting the sliver - I don’t even see that anymore - pretty sure it’s a problem with my code!

jim · January 4, 2025, 12:40am

I know I was joking before, but today’s charges were triple what I’m used to – and what I planned on based on caching from a month ago.

Is the program you wrote available for others to use? I’d prefer not to reinvent the wheel to go back through and track everything.

AlexanderSchick · January 4, 2025, 10:00am

I have the same experience and was wondering where the small amount of caching comes from.

I then switched from cost view to activity view in the dashboard. Regarding token usage, it looks almost as if the labels have been switched (and consequently the costs).

tao_gurufocus · January 8, 2025, 7:02am

As of today (2025-01-08), it seems cached input is back. I’m not sure yet; let’s keep watching.

jim · January 8, 2025, 8:07am

I’m seeing the same. Wondering what will happen about the overages these past couple of months.

b.silva · January 8, 2025, 12:07pm

Same for me, but they didn’t recalculate the past. Let’s wait.

kmsbernard · January 9, 2025, 4:37am

Support confirmed that the caching issue is fixed. They’re also looking into any potential overcharges, so past billing concerns might get resolved too.

AlexanderSchick · January 9, 2025, 12:12pm

Caching seems to work again for my application as well.

f-makino · January 9, 2025, 9:28pm

I just checked the dashboard. I haven’t changed the code, but it seems that token caching has resumed.

jim · January 13, 2025, 6:15pm

Wondering where we are on this with regard to being reimbursed for overcharges from when caching was not functioning properly.

Also, when it comes to support, how are you reaching out? Just through the Intercom-bot, or something else? I used the Intercom-bot last week, but it doesn’t look like anyone has read it and I haven’t received an email.

wilsoncmonteiro2 · January 20, 2025, 6:19pm

Has anyone else had this cached input going through the roof? It wasn’t there before. All the pricing here needs to be reassessed. Any thoughts, guys?

jim · January 22, 2025, 12:26am

They fixed caching on Jan. 8, which is what you’re seeing there. It would be A LOT higher if that orange wasn’t there (assuming that’s the cached input for you; mine is light blue).

Support has said they are working on a way to reconcile this, but no due date. Hopefully, they can use a little bit of that Stargate money to cover overages.

bento · March 25, 2025, 6:38pm

Anyone else having caching issues again as of yesterday (March 24th)?
