“Improved reasoning quality and cache utilization when compared to Chat Completions.”
It would seem to me that, on the contrary, Responses fails at almost anything related to "cache utilization".
Reasoning items that get replayed are placed as input before a user turn, but are then dropped at OpenAI's discretion once they are no longer adjacent to tool calls, so says the documentation.
Dropping an item before an input turn breaks that turn's cache instantly. Never dropping, and you carry a context roughly 3x the size of the output turns, forever, because of the retained internal reasoning.
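To make the cache-break mechanics concrete, here is a toy model of prefix caching (my own sketch, not OpenAI's implementation): a provider can only reuse cached computation for the longest shared leading run of items between the previous request and the new one, so a silently dropped reasoning item invalidates everything after its position.

```python
def shared_prefix_len(prev: list[str], curr: list[str]) -> int:
    """Number of leading items identical between two requests.

    Only this prefix is eligible for cache reuse; everything after
    the first difference must be recomputed at full price.
    """
    n = 0
    for a, b in zip(prev, curr):
        if a != b:
            break
        n += 1
    return n

# Turn with a replayed reasoning item in the middle:
turn1 = ["system", "user:1", "reasoning:1", "assistant:1", "user:2"]
# Next request, with the reasoning item silently dropped server-side:
turn2 = ["system", "user:1", "assistant:1", "user:2", "reasoning:2", "assistant:2"]

print(shared_prefix_len(turn1, turn2))  # → 2: only two items still cacheable
```

Everything from the drop point onward rewinds to the uncached rate, on every turn where the server decides to prune.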
Then there is the fact that using either "conversations" or "previous response ID" as stateful "chat" storage is completely unmanaged in length. You can run it up to the maximum, and then your only choices are an error, or a cache break on every turn as unknown context gets dropped.
Nothing on the endpoint is aware of cache persistence and expected expiry, so nothing knows when to elide a conversation back to a budget, or by how much per model. It just grows until failure.
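This is the eliding a client has to do for itself. A minimal sketch, assuming self-managed Chat Completions-style messages; the token estimate here is a crude chars/4 stand-in, where a real client would run an actual tokenizer:

```python
def elide_to_budget(messages, budget, est=lambda m: len(m["content"]) // 4):
    """Drop the oldest non-system messages until the estimate fits the budget.

    Trimming whole turns from the front (keeping the system prompt at
    index 0) breaks the cached prefix once per trim, deliberately,
    instead of unpredictably on every turn.
    """
    msgs = list(messages)
    while sum(est(m) for m in msgs) > budget and len(msgs) > 2:
        del msgs[1]  # index 0 is the system prompt; keep it
    return msgs

history = [{"role": "system", "content": "x" * 40}] + [
    {"role": "user" if i % 2 == 0 else "assistant", "content": "x" * 40}
    for i in range(6)
]
trimmed = elide_to_budget(history, budget=40)
print(len(trimmed))  # → 4: system prompt plus the three newest turns
```

Nothing exotic; the point is that with Responses-managed state there is no hook where this logic can live.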
Instructions: not a dynamic preprompt or post-prompt. That's another cache-killer. You have no mechanism to place late turns non-permanently (except for OpenAI's own tool system-message injections before and after user input, which break the cache and the developer's intentions yet again). Forget RAG placed where it needs to go; there is no role for it.
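This is what self-managed state buys you and Responses doesn't: the static instructions stay a stable, cache-friendly prefix, while per-turn RAG context goes in as a late, non-permanent message that is never written back to history. The roles and layout below are my own convention, a sketch, not an official placement:

```python
def build_request(static_system, history, rag_context, user_input):
    """Assemble one turn's messages with cache-aware placement.

    Prefix (system + history) is byte-identical across turns, so it can
    cache; the RAG context rides along after it and is simply not
    appended to `history`, so it vanishes next turn without a break.
    """
    return (
        [{"role": "system", "content": static_system}]          # stable prefix
        + history                                               # append-only
        + [{"role": "system", "content": f"Context:\n{rag_context}"},  # per-turn only
           {"role": "user", "content": user_input}]
    )

msgs = build_request(
    "You are a support bot.",
    [{"role": "user", "content": "hi"}, {"role": "assistant", "content": "hello"}],
    "retrieved doc snippet",
    "what is the refund policy?",
)
print([m["role"] for m in msgs])  # → ['system', 'user', 'assistant', 'system', 'user']
```

The retrieved snippet sits exactly where it helps, right before the question, and costs nothing to the cached prefix.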
“Better reasoning and lower costs” is purely hypothetical. Unless a developer does state management themselves. And avoids tools. And then controls the language instructions of their own functions.
IMO, a working, tuned application has no reason to leave Chat Completions, unless you also want 3x the network bandwidth streamed back at you in exchange for the savings on sending. I would not over-promise. And unlike Responses, with self-managed state you can switch chat AI providers the second you get a timeout.
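That failover is trivial precisely because the state is yours: the same messages list can be replayed to any Chat Completions-compatible backend the moment the first one stalls. A sketch, where the callables stand in for real SDK calls:

```python
def complete_with_fallback(messages, providers):
    """Try each provider callable in order; a TimeoutError moves to the next.

    `messages` is plain self-managed state, so every provider receives
    the identical conversation with no server-side session to migrate.
    """
    last_err = None
    for call in providers:
        try:
            return call(messages)
        except TimeoutError as err:
            last_err = err
    raise last_err

# Stand-ins for real client calls (hypothetical, for illustration):
def primary(messages):
    raise TimeoutError("primary provider timed out")

def backup(messages):
    return "ok from backup"

print(complete_with_fallback([{"role": "user", "content": "hi"}], [primary, backup]))
# → ok from backup
```

With `previous_response_id` or server-held "conversations", there is no equivalent: the state lives with the provider that just timed out.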
Maybe, on Responses, fix "conversations" so they don't fail to store anything when run in "background", if you've got some coding time over there, friends. And make the gpt-5 cache mechanism actually discount anything at all, unless your motivation is to break discounting on Chat Completions completely.