GPT-5 + Responses API is extremely slow

I’m using the Agents SDK with the Responses API. I was testing gpt-5 before switching over and noticed that it takes around one minute even for a basic query.

Traces shows the following:

Step                     Model        Input Tokens   Output Tokens   Duration
Triage Agent, 1st call   gpt-5-nano   ~3k            ~2k             ~9 seconds
Sales Agent, 1st call    gpt-5        ~5k            ~2k             ~40 seconds
Sales Agent, 2nd call    gpt-5        ~9k            ~1k             ~20 seconds

The user query is “Hello”. The same system using the gpt-4.1 family with the same prompt/context/user query takes only around 2-5 seconds.

I think this is related to the Responses API, because people using gpt-5 through the Completions API don’t seem to have this problem.
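
For anyone who wants to check this themselves, here is a minimal timing sketch using the official openai Node SDK (the model name and “Hello” prompt are just placeholders), sending the same query to both endpoints:

import OpenAI from 'openai';

const client = new OpenAI();

// Same "Hello" query against the Responses API...
const t0 = Date.now();
const res = await client.responses.create({
  model: 'gpt-5',
  input: 'Hello',
});
console.log(`Responses API: ${Date.now() - t0} ms`, res.output_text);

// ...and against Chat Completions, for comparison.
const t1 = Date.now();
const chat = await client.chat.completions.create({
  model: 'gpt-5',
  messages: [{ role: 'user', content: 'Hello' }],
});
console.log(`Chat Completions: ${Date.now() - t1} ms`, chat.choices[0].message.content);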

Similar problems on other posts:

16 Likes

Is it really this crazily slow? I’ve seen lots of complaints since last night.

4 Likes

Just changing from the gpt-4.1 family to the gpt-5 family caused the response time of a simple “Hello” query with a ~5k-token system prompt to go up from ~2-5 seconds to ~1 minute.

Which user would wait ~1 minute for “Hello”?

Can you post links to those complaints here as well? We can collect them all in one place.

2 Likes

I think there is an infrastructure problem. If I set background to true and check the status, I see my requests just stuck in “queued”; they never switch to processing.
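
If anyone wants to reproduce this, a minimal background-mode sketch with the openai Node SDK looks roughly like this (the model and polling interval are arbitrary):

import OpenAI from 'openai';

const client = new OpenAI();

// Start the request in the background instead of waiting on it.
let res = await client.responses.create({
  model: 'gpt-5',
  input: 'Hello',
  background: true,
});

// Status should move from "queued" to "in_progress" to "completed";
// for me it never leaves "queued".
while (res.status === 'queued' || res.status === 'in_progress') {
  await new Promise((resolve) => setTimeout(resolve, 2000));
  res = await client.responses.retrieve(res.id);
  console.log(res.status);
}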

2 Likes

I also think it’s an infrastructure issue; it can’t be this slow by design. Even with low reasoning effort it’s taking 5-6 minutes now.

1 Like

Are you guys using Responses API or Completions API?

1 Like

I’m using the Responses API. Let me try Completions.
PS: No, it’s nearly the same. I think there are serious issues; it’s not usable this way. On a positive note, the results I received with a lot of patience were really promising.

3 Likes

I’ve been getting response errors on just about every request I make this morning. Sometimes I refresh 5+ times before ChatGPT’s replies generate.

2 Likes

I am using the Responses API. Prompts with c. 4k tokens have gone from c. 5 seconds to 30+. I had to go through all my tests lengthening timeouts just to see what it produces. The outputs I looked at did look nice, but niceness at that cost is not worthwhile. I was unable to complete a full eval run due to the slow response times. I gave up and went back to 4.1 for now.

Maybe you do need to tune the extra parameters, but at the moment I can’t run enough evaluations to assess the quality of the “less thinking” version.
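
For reference, this is roughly how the speed-related knobs look on the Responses API, assuming the gpt-5 parameter names I’ve seen documented (reasoning effort and text verbosity):

import OpenAI from 'openai';

const client = new OpenAI();

const res = await client.responses.create({
  model: 'gpt-5',
  input: 'Hello',
  // gpt-5 defaults to medium effort; 'minimal' is the fastest setting.
  reasoning: { effort: 'minimal' },
  // Lower verbosity shortens the visible output.
  text: { verbosity: 'low' },
});
console.log(res.output_text);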

1 Like

I’m not getting any errors, but every request to gpt-5 at the basic “medium” effort will reason through a ton of tokens, and then there’s no final output. This is with the Responses API.

I would occasionally get a response after several minutes last night, but now it’s producing nothing.

I got the same with the open-source model.

Are you sending, or missing, a preset max_completion_tokens that limits the output budget? It now sets how much you want to pay, not how much you want to see. The finish_reason will also be “length” if the output was truncated by that parameter before delivery.

There are so many prompt tune-ups for bad behavior that you could send a book of instructions and the gpt-5 model would still ignore it.
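
A quick way to confirm the truncation theory on Chat Completions, as a rough sketch (the 1024 budget is just an example):

import OpenAI from 'openai';

const client = new OpenAI();

const chat = await client.chat.completions.create({
  model: 'gpt-5',
  messages: [{ role: 'user', content: 'Hello' }],
  max_completion_tokens: 1024, // reasoning tokens count against this budget
});

// An empty message with finish_reason "length" means the budget was
// spent on reasoning before any visible output was produced.
const choice = chat.choices[0];
if (choice.finish_reason === 'length') {
  console.warn('Truncated: raise max_completion_tokens or lower reasoning effort.');
}
console.log(choice.message.content);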

1 Like

GPT-5 is unusable for me too (Responses API). Had to revert to 4.1. It’s super slow and I keep getting failures due to max tokens, even with verbosity set to low. Never had that happen before.

4 Likes

You are all leaving out information about the most important parameter when it comes to speed: “reasoning effort”.

I see it’s faster now, but a few hours ago it was taking around 30 seconds at “minimal” and several minutes at “low”. I didn’t try the other two. I saw in the Playground that it was thinking very slowly, so I think it was an overload issue on their side.

This was super helpful - thank you.

It turned out my blank responses were because the reasoning tokens were hitting 2048, which was apparently the default output allowance. Once I bumped max_completion_tokens up to 5000, GPT-5 can’t stop talking. :slight_smile:
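
For the Responses API the equivalent knob is max_output_tokens, if I’m reading the docs right; a truncated run comes back with status “incomplete” rather than a finish_reason:

import OpenAI from 'openai';

const client = new OpenAI();

const res = await client.responses.create({
  model: 'gpt-5',
  input: 'Hello',
  max_output_tokens: 5000, // shared budget for reasoning + visible output
});

// Truncation shows up as an incomplete response, not an error.
if (res.status === 'incomplete' && res.incomplete_details?.reason === 'max_output_tokens') {
  console.warn('Ran out of output budget before the final answer.');
}
console.log(res.output_text);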

1 Like

I had it set to minimal and even then it was producing slow, unreliable responses.

5 is also very slow for me and breaks tool calling.

Still slow for me today.

4o and 4.1 respond near instantly, whereas 5 and 5-mini take around 30-60 seconds to respond.

I’m just using the basic setup from the Vercel docs.

import { openai } from '@ai-sdk/openai';
import { convertToModelMessages, streamText, UIMessage } from 'ai';
import { NextRequest } from 'next/server';

export async function POST(req: NextRequest) {
  const { messages }: { messages: UIMessage[] } = await req.json();

  const system = `PROMPT...`.trim();

  const result = streamText({
    model: openai('gpt-5'),
    system,
    messages: convertToModelMessages(messages),
  });

  return result.toUIMessageStreamResponse();
}
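
If the default reasoning effort is the bottleneck, the AI SDK seems to accept provider-specific options; something like this should work (reasoningEffort is the @ai-sdk/openai option name as I understand it, so double-check it against your SDK version):

import { openai } from '@ai-sdk/openai';
import { convertToModelMessages, streamText, UIMessage } from 'ai';
import { NextRequest } from 'next/server';

export async function POST(req: NextRequest) {
  const { messages }: { messages: UIMessage[] } = await req.json();

  const result = streamText({
    model: openai('gpt-5'),
    messages: convertToModelMessages(messages),
    // Pass OpenAI-specific settings through provider options.
    providerOptions: {
      openai: { reasoningEffort: 'minimal' }, // default is 'medium'
    },
  });

  return result.toUIMessageStreamResponse();
}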

I won’t be able to use 5 until this is resolved.

Why was this flagged? I love ChatGPT and I really, really want this to work, but using it every day has become unbearable. It goes back and forth: sometimes it’s somewhat fast, and by fast I mean at most 10 seconds for a response, which is already bad, and sometimes I wait 5 minutes for a response. Do you have any idea how agonizingly painful that is when you are trying to code something? Clearly something is wrong and needs to be addressed, but what worries me even more is that there is no official communication acknowledging the issues, at least none that I have seen. Everyone from OpenAI is operating as if everything is great.

1 Like

Look up top:

(screenshot of the thread header)

Everything you see there makes discussion of the consumer ChatGPT product off-topic.
ChatGPT isn’t something API-using developers can help you improve, except to tell you to pick “mini” in the new model selector for a faster start of visible output.

1 Like