How is prompt_cache_key actually used in API calls?

+1, it’s very inconsistent even with a static prompt and cache key. I have found that for best results you need to “prime” the cache first. For example, try sending 50 prompts, one every 2 seconds. About halfway through you start getting frequent hits, at roughly an 80% rate. The frustrating part is that it’s been impossible to figure out the optimal priming formula: e.g., 15 prompts every 10 seconds sometimes seems to do it, other times not.
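For anyone who wants to measure this themselves, here is a minimal sketch of the priming loop described above, using the documented `prompt_cache_key` request parameter and the `usage.prompt_tokens_details.cached_tokens` field to compute a per-request hit rate. The 50-requests-every-2-seconds schedule is just the pattern from my tests, not an official recipe, and the model name and cache key are placeholders:

```python
import os
import time


def cache_hit_rate(cached_tokens: int, prompt_tokens: int) -> float:
    """Fraction of the prompt that was served from cache (0.0 when nothing cached)."""
    return cached_tokens / prompt_tokens if prompt_tokens else 0.0


def prime_cache(client, prompt: str, cache_key: str,
                attempts: int = 50, interval_s: float = 2.0) -> list[float]:
    """Send the same static prompt repeatedly and record per-request hit rates.

    Note: prompts shorter than the caching minimum (~1024 tokens per the docs)
    will never produce cached_tokens > 0, so use a long static prompt.
    """
    rates = []
    for _ in range(attempts):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model
            messages=[{"role": "user", "content": prompt}],
            prompt_cache_key=cache_key,
        )
        usage = resp.usage
        cached = usage.prompt_tokens_details.cached_tokens
        rates.append(cache_hit_rate(cached, usage.prompt_tokens))
        time.sleep(interval_s)
    return rates


if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI

    long_static_prompt = "Summarize the following policy document. " * 200
    rates = prime_cache(OpenAI(), long_static_prompt, cache_key="my-static-key")
    print([f"{r:.0%}" for r in rates])
```

Plotting `rates` over the run makes the "kicks in about halfway through" behavior easy to see, and lets you compare different `attempts`/`interval_s` combinations.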

OpenAI, please give us more details on how the cache works; the docs are too vague. See Prompt caching with tools - API - OpenAI Developer Community for more questions.