Caching rate drop after switching to Responses API

I recently switched my app backend to the Responses API and migrated from gpt-5 to gpt-5.1. Since the change, my cache rate (% of cached input tokens) has dropped from 40% to 0%.

What am I doing wrong? Does caching work differently for Responses API vs Chat Completions API?

Hi folks,

Just to update the thread in case it’s helpful to someone else: the caching issue turned out to be a bug in the prompt itself, not in the Responses API.

In the prompt, I used to have the date as follows at the end of the system prompt:

- Today's date is Thu, 8 Jan 2026

A recent release changed this to include the time of day (to help the AI personalise the greeting and time-of-day references better in the response):

- Today’s date is Thu, 8 Jan 2026, 00:51:42.

This, of course, invalidated the cache entirely: including the exact time made the prompt prefix unique on every single request, so no prefix could ever match a cached one.

I addressed this by injecting a coarser-grained reference to the time of day instead, like this:

- Today’s date is Thu, 8 Jan 2026, Evening

And now everything works fine.
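The fix above can be sketched in Python (a minimal illustration, assuming a Python backend; the bucket boundaries and helper names are my own, not the poster's actual code). The key point is that the prompt line now only changes a few times per day instead of every second, so the cached prefix survives across requests:

```python
from datetime import datetime

def time_of_day(dt: datetime) -> str:
    # Coarse bucket: changes at most 4 times per day, so the system
    # prompt prefix stays cache-friendly (illustrative boundaries).
    hour = dt.hour
    if 5 <= hour < 12:
        return "Morning"
    if 12 <= hour < 17:
        return "Afternoon"
    if 17 <= hour < 22:
        return "Evening"
    return "Night"

def date_line(dt: datetime) -> str:
    # Renders e.g. "Today's date is Thu, 8 Jan 2026, Evening"
    # instead of a second-precision timestamp.
    return (
        f"Today's date is {dt.strftime('%a')}, "
        f"{dt.day} {dt.strftime('%b %Y')}, {time_of_day(dt)}"
    )
```

Any bucketing scheme works as long as the rendered string is stable for long stretches; even an hourly bucket would keep the cache warm far better than a per-second timestamp.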


OpenAI already injects today’s date into these models on the API, along with other unwanted, “we know best” system-message content placed before your own — often including false information about your user’s location and about your API application.

Be sure to frame your own date information in a non-conflicting, information-building manner, e.g. “True user locale date {time}{UTC offset} - always converse using this date”.

You can also place a “developer message” as an immediate pre-prompt to the user message; then you only lose one turn of cache (or get no cache hit at all if you keep repeating it without removing the earlier copy from history). The gpt-5.x models already misuse and repeat back dates excessively, so this should only be employed when correctness really matters — e.g. when scheduling “tomorrow” must be right.
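That placement can be sketched as follows (a hypothetical helper, assuming a Python backend; the function name and message shapes are illustrative). Keeping the volatile date line out of the long-lived system prompt and injecting it just before the newest user turn means only the tail of the request changes, so the cached prefix — system prompt plus earlier history — stays intact:

```python
def build_input(system_prompt, history, user_message, date_line):
    # history: prior turns, unchanged between requests, so they sit
    # inside the cacheable prefix. The volatile developer message is
    # appended last, just before the new user turn.
    return (
        [{"role": "system", "content": system_prompt}]
        + history
        + [
            {"role": "developer", "content": date_line},
            {"role": "user", "content": user_message},
        ]
    )
```

A list built this way can be passed as the `input` of a Responses API call; the tradeoff the reply describes applies — re-injecting a fresh date each turn without pruning the old copy from history gives up the cache hit again.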
