GPT-4 extremely slow compared to 3.5

Hi there! I’m migrating from 3.5 to 4, but it’s extremely slow. I’m putting all my responses through a relay service, which has a timeout of 30 seconds. Normally this wouldn’t be a problem, but with GPT4 it often happens even small requests under 1000 tokens will take longer than that.

Any idea if this is a currently known issue? Is there anything we can do?


We can only hope it will get faster. Experiencing the same problem

Yeah, it seems to be a kind of indirect view into how poor the Microsoft Azure infrastructure services supporting OpenAI actually are.



From a business perspective it’s a tough sell when people can experience GPT4 through ChatGPT blazingly fast.

GPT3.5 was significantly faster than 3.0, so I really hoped GPT4 would be even better. So it’s not only 29x more expensive to use compared to 3.5, but at least 2x slower as well.

I really hope they’ll fix this soon and make it at least fast enough so requests will fit within the 30second window. ChatGPT can do it in 1-3 seconds already. So looks-like it’s a priority choice they made.

1 Like

Might be a current problem with whole infrastructure overload, as on my side 3.5turbo is almost unresponsive as well …

Kind of seems not surprising, as I can imagine the challenge behind tackling scalability in such a short time frame with this popularity and number of users of both chat and API. But then again yeah, with Azure behind it should be better… lets hope they manage it and fix it …


I’m having the slowness issue as well. So, let’s get this straight. With the free version, it was quite snappy but it was prone to hallucinations and server overload. The paid API gets released, no more server overloads for 3.5. Then GPT4 comes out and I don’t have the problem with not being able to access the api but it’s insanely slow? What’s going on here?

Current outage check latest status HERE!


the same call with the same data can take up to 4 times slower than 3.5 turbo, this is insane.


Same here. Is OpenAI planning to update/improve their server or API? No one of our customers would like to want to wait more than 30 seconds for the chat response when it is moved to prod :sweat_smile:

Same problem. GPT-4 through API is either extremely slow or not working at all. Practically useless. I tried it for the last 5 days. GPT-3.5-turbo is working well, but does not provide the same qaulity as GPT-4.

I just tried to switch to GPT-4 but it’s so slow that I couldn’t even consider using it right now.

Completely agree. It’s a real shame, because even despite the quality jump, waiting this long for a response is not acceptable to customers (especially ones that are paying). In terms of UI & responsiveness, even 10s and people start to think something has gone wrong / they’re stuck. Will keep testing each day, praying for an improvement :slight_smile:

I have the same problem. 4.0 is extremely slow, and what’s worse is that it seems like the default 3.5 is doing a lot more wrong than before 4.0 was released.

same problem ~ gpt4 no response after 120 s !

Just FYI, here is what I did to solve this problem.

I turned on the “stream” option and set it up in the back-end. With this option, at least I could check the progress of the outputs, similar to how ChatGPT works. So, I am okaysh now. :smile:

1 Like