GPT-4o and GPT-4o-mini Cache Not Working?

Hi everyone,

Today I noticed that the cache is not working for me with both GPT-4o and GPT-4o-mini. My code is exactly the same as before. The initial prompts I’m using are long enough to trigger the cache on successive calls, and caching was working correctly before.
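One way to confirm whether the cache is actually firing is to inspect the `usage` field of the response: when a prefix is served from the cache, `usage.prompt_tokens_details.cached_tokens` is non-zero. Here is a minimal sketch of a helper that reads that field, assuming the usage payload shape documented for prompt caching (the sample numbers are made up for illustration):

```python
def cached_tokens(usage: dict) -> int:
    """Return how many prompt tokens were served from the cache (0 = cache miss)."""
    details = usage.get("prompt_tokens_details") or {}
    return details.get("cached_tokens", 0)

# Hypothetical usage payload in the shape Chat Completions returns:
usage = {
    "prompt_tokens": 2006,
    "completion_tokens": 300,
    "total_tokens": 2306,
    "prompt_tokens_details": {"cached_tokens": 1920},
}
print(cached_tokens(usage))  # 1920 -> most of the prompt was cached
```

Logging this value on successive calls makes it easy to tell a real cache regression apart from prompts that are too short or whose prefixes change between calls.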

Is anyone else experiencing the same issue? Any insights would be appreciated!

Thanks!


Hi @davidia,

Thanks for reporting.

I just tested it, and it seems to be working on gpt-4o but not on gpt-4o-mini.


Thanks for checking! I tested again, and caching now works with GPT-4o—not sure if it just started working or if I missed it before. But GPT-4o-mini is still not working.

Just tested it today (18/11/2025) and GPT-4o-mini Cached Input is still not working.

First, update the openai library to the latest version, then try this; it should work:

from openai import OpenAI

client = OpenAI()

# openai.ChatCompletion.create was removed in openai>=1.0;
# use the client-based interface instead.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "static instructions..."},
        {"role": "user", "content": "Your question"},
    ],
    prompt_cache_retention="in_memory",
)

Closed because this is a topic from February 2025