How is prompt_cache_key actually used in API calls?

+1, it’s very inconsistent even with a static prompt and cache key. I have found that for best results you need to “prime” the cache first. For example, try sending 50 prompts, one every 2 seconds. About halfway through you start getting frequent hits, at roughly an 80% rate. The frustrating part is that it’s been impossible to figure out the optimal priming formula: e.g., 15 prompts every 10 seconds sometimes seems to do it, other times not.
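For anyone who wants to measure this themselves, here is a minimal sketch of the priming loop described above, using the documented `prompt_cache_key` request parameter and the `usage.prompt_tokens_details.cached_tokens` field to compute a per-request hit rate. The 50-requests-every-2-seconds schedule is just the pattern from my tests, not an official recipe, and the model name and cache key are placeholders:

```python
import os
import time


def cache_hit_rate(cached_tokens: int, prompt_tokens: int) -> float:
    """Fraction of the prompt that was served from cache (0.0 when nothing cached)."""
    return cached_tokens / prompt_tokens if prompt_tokens else 0.0


def prime_cache(client, prompt: str, cache_key: str,
                attempts: int = 50, interval_s: float = 2.0) -> list[float]:
    """Send the same static prompt repeatedly and record per-request hit rates.

    Note: prompts shorter than the caching minimum (~1024 tokens per the docs)
    will never produce cached_tokens > 0, so use a long static prompt.
    """
    rates = []
    for _ in range(attempts):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model
            messages=[{"role": "user", "content": prompt}],
            prompt_cache_key=cache_key,
        )
        usage = resp.usage
        cached = usage.prompt_tokens_details.cached_tokens
        rates.append(cache_hit_rate(cached, usage.prompt_tokens))
        time.sleep(interval_s)
    return rates


if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI

    long_static_prompt = "Summarize the following policy document. " * 200
    rates = prime_cache(OpenAI(), long_static_prompt, cache_key="my-static-key")
    print([f"{r:.0%}" for r in rates])
```

Plotting `rates` over the run makes the "kicks in about halfway through" behavior easy to see, and lets you compare different `attempts`/`interval_s` combinations.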

OpenAI, please give us more details on how the cache works; the docs are too vague. See Prompt caching with tools - API - OpenAI Developer Community for more questions.