Hi, I’ve been testing the new prompt caching feature and it doesn’t seem to work as expected.
When I send a request whose prompt is only slightly longer than the 1024-token minimum specified in the OpenAI documentation, the tokens are not cached.
Only after increasing the token count well beyond that minimum does caching activate correctly.
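To confirm the prompt really does cross the threshold, the tokens can be counted with tiktoken. A minimal sketch, assuming the o200k_base encoding used by the gpt-4o family (the exact count may differ slightly from what the API reports):

import tiktoken

# gpt-4o-mini uses the o200k_base encoding
enc = tiktoken.get_encoding('o200k_base')
text = 'Testing cache ' * 550
# Should land only slightly above the documented 1024-token minimum
print(len(enc.encode(text)))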
Documentation reference:
https://platform.openai.com/docs/guides/prompt-caching
Code:
import openai

client = openai.OpenAI(api_key='your_api_key')

# Roughly 1100 tokens: just above the documented 1024-token caching minimum
text = 'Testing cache ' * 550

r = client.chat.completions.create(
    model='gpt-4o-mini',
    messages=[{'role': 'user', 'content': text}],
)
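For completeness, here is how I check whether caching kicked in. A minimal sketch, assuming the cached_tokens field under usage.prompt_tokens_details in the current API; note that a cache hit can only show up from the second request with an identical prefix onward:

# Send the same prompt twice; the second response should report cached tokens
for i in range(2):
    r = client.chat.completions.create(
        model='gpt-4o-mini',
        messages=[{'role': 'user', 'content': text}],
    )
    # cached_tokens is 0 on a cache miss, > 0 on a hit
    details = r.usage.prompt_tokens_details
    print(i, details.cached_tokens if details else None)

With the ~1100-token prompt above, cached_tokens stays 0 on the second request; with a longer prompt it reports a nonzero value as expected.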