Very slow response time with chatgpt-3.5 turbo model API

eldar_agayev · April 9, 2023, 9:49am

I need to wait for 40 seconds to get a response on gpt-3.5 chat model. Is this normal? The response is in Ukrainian. It used to be 5-10 seconds a few days ago.

leockc812 · April 9, 2023, 10:00am

Same issue in Germany. Just a “Hello” wait for 15s to response.

eldar_agayev · April 9, 2023, 10:28am

It works, but in some cases. For me, when I use this model for the actual chat with question/answer, where the chatbot is actually asking questions, not answering the response time is very crucial

jayben71 · April 9, 2023, 10:29am

No it’s not normal, OpenAI needs to fix this up soon, why is it not the same speed as the Playground?

leockc812 · April 9, 2023, 10:36am

I think for developers, we only want OpenAI to explain officially the reason for huge difference between playground and using api. And maybe a solution for the issue.
Of course, we can use some tricks to avoid bad user experience, but it won’t be the best one.

dlflannery · April 9, 2023, 10:56am

I’m timing out after one or sometimes even two minutes. I’m still on my free “grant”. Does the API adjust priority down based on that?

leockc812 · April 9, 2023, 11:15am

There are two ways to receive a response from OpenAI:

End-to-end: This generates all the words and sends the result as one response. However, this method may cause a timeout issue since most servers only wait for a response for up to 30 seconds.
Stream mode: This sends each word to the user as soon as it is generated. This is what we refer to as the “first response time.”

My personal experience with the “first response time” was less than 3 seconds just two days ago. However, currently, it takes 15 seconds or more.

leockc812 · April 9, 2023, 11:37am

In request body add stream:true to use stream mode. Otherwise, it’s normal mode.

aymantanners0 · April 9, 2023, 6:47pm

same problem , from india , Davinci works perfectly fine but 3.5turbo takes too much time to respond

bg6nwl · April 10, 2023, 7:35am

Does OpenAI have any comments on reducing speed?

hozayen · April 13, 2023, 6:20pm

The problem is worse on Mobile for me compared to PCs. It takes 90 secs on MS Edge and 150 secs on Chrome on PC/Mac. But it actually rarely spits out any results on Mobile.

Not sure if this is related to gpt-3.5-turbo. I’m in New Zealand.

eggtech · April 17, 2023, 9:58pm

Noticed this with other models too, like whisper-1. Both text-generation and speech-to-text from OpenAI has become drastically slower for me in recent weeks.

With regards to speech-to-text, the latencies with Whisper got so bad that I switched to an alternative (Deepgram) that was faster and cheaper. The results were equal, if not better than Whisper’s.

If anyone has alternative APIs to 3.5-turbo, I’d love to hear them!

jono · April 22, 2023, 12:32am

Has anyone noticed that this is different user-to-user? We send a user id with each request (this used to be a requirement, but it seems they dropped it) and certain users are consistently 4-5x slower than others. I did an apples to apples comparison, only changing the user ids on a short n=5 prompt, and is was seven seconds vs 29 seconds for different users. The slow users are also our “power users” that use the software much more regularly than others. The docs say that throttling is at the account level, so I’m not sure why this would be.

and.spbg · May 2, 2023, 2:55am

It feels like they are slowing down selected users, my speed dropped about 2 days ago. Currently, a request with ‘messages’: [{‘role’: ‘user’, ‘content’: ‘Hello!’}] takes about 20 seconds ± 3 seconds. However, for example, there is a service that works instantly : https://opchatgpt.net/chatgpt-online/

tcgumus · May 5, 2023, 1:24pm

we are seeing slowing down in our responses as well. does anyone know the cause or fix?

taivo · May 30, 2023, 9:21pm

If OpenAI has applied some throttling to your account there is probably little you can do about it (though I suspect it may just be some internal trade-off they are making between how much resources to spend on 3.5 vs 4).

But what you can do is try to reduce the number of output tokens – which has the most effect on response time. Here I wrote a list of tricks I’ve used to reduce GPT response time - I hope some of them will help.

HNazmul · October 23, 2023, 8:26am

Hey, its really painful. had to wait a long just to generate a simple 4-5 lines text.
I am from bangladesh

Topic		Replies	Views
Chat API is slow!, Fix it! API gpt-35-turbo , chatgpt , api	6	2672	December 24, 2023
Unstable speed of gpt-3.5-turbo-16k API api , gpt-35-turbo-16k , performance	6	1111	January 9, 2024
GPT-3.5 API is very slow. Any fix? API	31	9943	October 12, 2023
Chatgpt-3.5 turbo model takes long time to respond. Is there any way to speed this up? API gpt-35-turbo , api-speed	7	6586	December 19, 2023
Chat GPT's API is significantly slower than the website with GPT Plus API	35	36907	December 12, 2023

Very slow response time with chatgpt-3.5 turbo model API

Related topics