Hey, I've seen many discussions about prompt caching, but when I try to use a common prefix to get cache hits, it seems to fail every time.
I'm not quite sure what I'm doing wrong; below are the cost insights from two gpt-5 API responses.
I prepended the same ~7000-token context at the start of the system prompt.
I also ran the requests through a diff checker, which confirms the first system message shares a very long common prefix across requests.
I have also tried different models (gpt-4.1, gpt-4.1-mini, gpt-5); none of them hit the cache.
I think I must be doing something wrong. Does anyone have the same issue, or a suggestion?
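For reference, this is how I'm checking for hits: I read `usage.prompt_tokens_details.cached_tokens` out of the chat completions response. The payloads below are fabricated just to show the shape I expect (in my real runs `cached_tokens` is always 0):

```python
def cached_tokens(usage: dict) -> int:
    """Return the number of prompt tokens served from cache (0 = cache miss)."""
    return usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)

# Fabricated usage blocks from two back-to-back requests sharing a long prefix.
first = {
    "prompt_tokens": 7200,
    "completion_tokens": 150,
    "prompt_tokens_details": {"cached_tokens": 0},      # cold cache
}
second = {
    "prompt_tokens": 7200,
    "completion_tokens": 150,
    "prompt_tokens_details": {"cached_tokens": 7168},   # what a prefix hit should look like
}

print(cached_tokens(first), cached_tokens(second))  # → 0 7168
```

My understanding is that the shared prefix has to be over 1024 tokens (mine is ~7000) and byte-identical across requests, so I'd expect the second request to show a nonzero `cached_tokens` like above.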
