Can I cache large chunks on gpt-5-nano? Does each cache-read request reset the cache's inactive time? Do large caches affect cache overflow limits?

Welcome to the community.

a) Yes. b) Probably, but it isn't guaranteed. c) It's driven by last activity.

But rather than taking other people's word on this, I suggest testing with your own data. A few things to watch for:

  • Some settings, like using the instructions parameter instead of an input role, may break caching on the Responses API
  • Caching on the Responses API is currently erratic: it can fail outright or take a few minutes to take effect, and it isn't very deterministic right now. The Chat Completions API currently gives more stable caching results. You will find several threads about this around here.
  • If saving costs is important to you, it may be worth logging your requests (at least timestamp, request ID, and usage), so that you can monitor later whether caching failed because of something you missed, or because of the non-deterministic service behavior mentioned above.
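To make the logging suggestion above concrete, here is a minimal sketch of a helper that records the cache-relevant fields from a Chat Completions response. It assumes the response has already been converted to a plain dict; the field path `usage.prompt_tokens_details.cached_tokens` matches the current API reference, but verify it against your SDK version before relying on it.

```python
import json
import time


def log_usage(response: dict, logfile: str = "cache_log.jsonl") -> dict:
    """Append the fields worth keeping for cache monitoring to a JSONL log."""
    usage = response.get("usage", {})
    entry = {
        "timestamp": time.time(),
        "request_id": response.get("id"),
        "prompt_tokens": usage.get("prompt_tokens"),
        # cached_tokens reports how many prompt tokens were served from cache;
        # 0 means the cache missed (or the prompt was below the caching minimum)
        "cached_tokens": usage.get("prompt_tokens_details", {}).get("cached_tokens", 0),
    }
    with open(logfile, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


# Example with a mocked response payload (shape as returned by Chat Completions)
sample = {
    "id": "chatcmpl-abc123",
    "usage": {
        "prompt_tokens": 2048,
        "prompt_tokens_details": {"cached_tokens": 1920},
    },
}
entry = log_usage(sample)
print(entry["cached_tokens"])  # 1920
```

Scanning the log later for entries where `cached_tokens` is 0 despite a repeated large prefix is a quick way to spot the silent cache failures mentioned above.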

Edit: I’ve noticed caching has deteriorated lately even with Chat Completions, so I’ve run some more detailed tests and opened a separate new thread to keep track of this.
