Responses API vs Prompt caching

The Responses API is the new stateful API. Given this new option, when would you still use prompt caching if the Responses API can “save” the initial message/prompt?

For instance, let’s say we have a long text with many small, independent queries (perhaps one per paragraph of the long text). With prompt caching, I would need to resend the long text at the beginning of every request so the prompt prefix matches the cache, followed by the actual query for the paragraph. With the new Responses API, I would send the long text only once and then use the response.id to make queries for the individual paragraphs (in this example). It seems to me that this new stateful API is actually a better implementation of caching, offering more control. Perhaps there is an even better way of solving this example.
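To make the comparison concrete, here is a minimal sketch of the two request shapes, assuming the OpenAI Python SDK's `client.responses.create` call; the model name, `LONG_TEXT`, and the helper function names are my own placeholders, not anything from the docs. The helpers only build the request payloads so the structural difference is visible:

```python
def caching_request(long_text: str, query: str) -> dict:
    """Prompt-caching style: the long text must lead every request,
    byte-identical, so the shared prefix can be served from cache."""
    return {
        "model": "gpt-4o",  # placeholder model name
        "input": f"{long_text}\n\nQuestion: {query}",
    }

def responses_followup(previous_response_id: str, query: str) -> dict:
    """Responses-API style: the long text was sent once up front;
    each follow-up references that turn via previous_response_id
    and carries only the short per-paragraph query."""
    return {
        "model": "gpt-4o",  # placeholder model name
        "previous_response_id": previous_response_id,
        "input": query,
    }

# Usage sketch (network calls omitted):
#   first = client.responses.create(model="gpt-4o", input=LONG_TEXT)
#   follow = client.responses.create(
#       **responses_followup(first.id, "Summarize paragraph 2")
#   )
```

Note that the caching route still pays for transmitting the long prefix on every call (even if cached tokens are discounted), whereas the stateful route sends only the short query over the wire.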

However, I am unclear whether the initial message (sent when the response was created) is billed as cached input on subsequent calls. Perhaps this is where OpenAI is providing options, but from a technical perspective, I don’t see why one would use prompt caching when the new Responses API is available.