Stateful Responses API Much Slower Than Chat Completions

Hey all! Steve here from the OpenAI API Eng team. We hear you on the latency issues when using previous_response_id. We are working to optimize our database to make these lookups as fast as possible. In the meantime, for the lowest possible latency we recommend setting store: false. In that mode you round-trip all conversation items yourself (as with Chat Completions), and we skip the database entirely, so there is no latency hit from looking up a previous response.
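To make the stateless pattern concrete, here is a minimal sketch of the round-trip approach using the Python SDK's Responses API. The `build_request` helper and the exact item shapes are illustrative assumptions, not part of the SDK; the key points are passing the full item history in `input` and setting `store=False` instead of using `previous_response_id`.

```python
def build_request(history, user_text, model="gpt-4o"):
    """Build kwargs for a stateless client.responses.create(...) call.

    `history` is the client-kept list of prior conversation items
    (user inputs plus the output items from earlier responses).
    The model name here is just a placeholder.
    """
    items = list(history) + [{"role": "user", "content": user_text}]
    return {
        "model": model,
        "input": items,   # full conversation, resent every turn
        "store": False,   # skip server-side persistence and the lookup
    }

# Sketch of a turn loop (API call commented out; requires a key):
# client = openai.OpenAI()
# history = []
# kwargs = build_request(history, "Hello!")
# resp = client.responses.create(**kwargs)
# history = kwargs["input"] + resp.output  # carry items into next turn
```

Since nothing is stored server-side, your application is responsible for persisting the item history between turns, and request size grows with conversation length, just as it does with Chat Completions.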
