Possible Cache Issue on GPT-5-mini and GPT-5-nano

Hello everyone,
I want to report something I've been testing over the last few days, because it may be helpful for other devs too.

I ran a large batch of tests (around 30 runs) across different models: gpt-5.1, gpt-5-nano, and gpt-5-mini.
Right now, only GPT-5.1 shows consistent caching behavior.

When I tested GPT-5-mini and GPT-5-nano, the cache almost never hit.
For example:
I sent 20 prompts with the exact same system input, and only 2 requests were cached successfully.
The rest came back with cached_tokens = 0, or looked like the model didn't even try to use the cache at all.
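For reference, this is roughly how I tallied the runs. It's a minimal sketch that assumes you've already collected the `cached_tokens` value from each response's `usage.prompt_tokens_details` (the helper name and the recorded counts here are just for illustration):

```python
def cache_stats(cached_token_counts):
    """Tally prompt-cache behavior from a list of cached_tokens values,
    one per request (read from usage.prompt_tokens_details.cached_tokens)."""
    hits = sum(1 for c in cached_token_counts if c > 0)
    misses = len(cached_token_counts) - hits
    failure_rate = 100.0 * misses / len(cached_token_counts)
    return hits, misses, failure_rate

# Example: 20 identical prompts, only 2 came back with cached tokens.
print(cache_stats([0] * 18 + [512, 384]))  # -> (2, 18, 90.0)
```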

I was expecting these smaller models to use caching a lot more, since the announcement said they should be more optimized. But in practice, something doesn't seem to be working correctly.

Not sure if this is a general bug, or if it only affects some users.
If anyone else is facing the same issue, it would be nice to know.

Also, for people who are NOT having this issue:
did you change something in your API config, or add some parameter that makes the cache hit more often?
Just trying to understand whether I'm missing something in my setup.

And if the @OpenAI_Support team can look into this behavior, it would help a lot, because for production apps the cache is really important (especially for nano/mini, where cost and speed matter).


I can confirm that the cache is broken for all models I tested except gpt-5.1. Not even using prompt_cache_key helps.

These are my results after 120 tests each:

gpt-5.2

no prompt_cache_key: 45% failure rate.
with prompt_cache_key: 30% failure rate.

gpt-5.1

no prompt_cache_key: 19% failure rate.
with prompt_cache_key: 6% failure rate.

gpt-5-mini

no prompt_cache_key: 72% failure rate.
with prompt_cache_key: 76% failure rate.

gpt-5-nano

no prompt_cache_key: 25% failure rate.
with prompt_cache_key: 28% failure rate.

Even more problematic: using prompt_cache_key has no positive effect at all on mini and nano, and actually makes them slightly worse…
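For anyone who wants to reproduce the comparison: this is roughly how I build the two request variants. It's a sketch assuming the Chat Completions endpoint (the resulting dict would be passed as `client.chat.completions.create(**req)`); the helper name and the cache-key label are just for illustration:

```python
def build_request(model, system_prompt, user_prompt, cache_key=None):
    """Build Chat Completions kwargs; optionally pin requests to one
    cache bucket via prompt_cache_key (omitted for the baseline runs)."""
    req = {
        "model": model,
        "messages": [
            # Keep the long shared prefix identical across runs so it is cacheable.
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }
    if cache_key is not None:
        req["prompt_cache_key"] = cache_key  # arbitrary label, e.g. "cache-test-mini-v1"
    return req

# Two variants of the same prompt, with and without the key:
baseline = build_request("gpt-5-mini", "You are a helpful assistant.", "ping")
keyed = build_request("gpt-5-mini", "You are a helpful assistant.", "ping",
                      cache_key="cache-test-mini-v1")
```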
