Hello, good morning. I am working on an application and trying to integrate gpt-3.5-turbo. The big issue I've noticed is that gpt-3.5-turbo is very slow to answer, regardless of whether streaming mode is enabled. Even a simple question takes between 15 and 40 seconds.
Of course, I've already ruled out my network and my application as the cause, since I'm testing it from the operating system terminal. I even rented a dedicated OVH server in the United States to rule out a latency issue, and it still returns the same response times.
To debug further, I switched to other language models (Davinci and Curie), and the server responses are practically instantaneous. This means the problem is not on my side, it's on OpenAI's, and it has been like this for over a week.
I am surprised that, with the money OpenAI raises from investors, they don't buy better servers. I also bought ChatGPT Plus, and I notice that gpt-3.5-turbo runs 10x faster there. I suspect OpenAI has purposely slowed down the gpt-3.5-turbo API so that we use more expensive models in our applications, since they make much more money from users buying ChatGPT Plus than from the gpt-3.5-turbo API.
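For what it's worth, with streaming enabled the meaningful number is time-to-first-token, not total completion time. A minimal sketch for measuring it against the chat completions REST endpoint, using only the standard library (the `OPENAI_API_KEY` environment variable and the prompt are placeholders):

```python
import json
import os
import time
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"

def build_chat_payload(model: str, prompt: str, stream: bool = True) -> dict:
    """JSON body for a chat completion request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

def time_to_first_token(prompt: str, model: str = "gpt-3.5-turbo") -> float:
    """Seconds from sending the request until the first streamed chunk arrives."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_chat_payload(model, prompt)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    start = time.perf_counter()
    with urllib.request.urlopen(req, timeout=60) as resp:
        for raw in resp:  # server-sent events: one "data: ..." line per chunk
            line = raw.strip()
            if line.startswith(b"data: ") and line != b"data: [DONE]":
                return time.perf_counter() - start
    raise RuntimeError("stream ended before any token arrived")

# usage (needs OPENAI_API_KEY set):
# print(f"first token after {time_to_first_token('Say hello'):.2f}s")
```

If the first token arrives quickly but the full answer still takes tens of seconds, the bottleneck is generation throughput rather than queueing, which is useful to know when filing a report.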
I am experiencing the same!
A few days ago, I was getting responses from gpt-3.5-turbo very quickly compared to what I've been getting since yesterday. The response time has dramatically increased from 5-8s to more than 20s for the same prompt. I also tried the text-davinci-003 model, and it's significantly faster and more accurate than 3.5-turbo, but it's significantly costlier too.
I'm disappointed that 3.5-turbo has slowed down so much that, user-experience-wise, it's practically unusable.
Same issue here. My Telegram bot currently receives many timeout errors with gpt-3.5-turbo. I looked for the error on my side, and it turns out it's an API issue. When I changed the model, the issue was solved. Hopefully OpenAI resolves this API outage quickly.
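One stopgap for those timeouts is a client-side deadline plus a fallback model, which is essentially what switching models by hand does. A hedged sketch; `call_model` stands in for whichever client call your bot actually makes, and the model list is just an example:

```python
from typing import Callable, Sequence

def complete_with_fallback(
    prompt: str,
    call_model: Callable[[str, str], str],
    models: Sequence[str] = ("gpt-3.5-turbo", "text-davinci-003"),
) -> str:
    """Try each model in order, falling through when a call times out."""
    last_err = None
    for model in models:
        try:
            return call_model(model, prompt)
        except TimeoutError as err:
            last_err = err  # this model is too slow right now; try the next one
    raise RuntimeError("all models timed out") from last_err
```

The bot stays responsive during an outage at the cost of occasionally paying the higher per-token price of the fallback model.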
Caching good answers for common questions might help as well (e.g. if you want to collect commands for machines).
Or, for OCR pipelines, you might let GPT-3 build some templates (like back in the good old days), e.g. for invoices. If another invoice from the same client comes in and you can gather all the data with the template, you might not even need the API for every request.
I am experiencing the same issue. Several days ago it took 40s on average to respond, and yesterday it went up to at least 80s. Today, after a couple of hours of errors, it somehow went down to about 60s. Still unbelievably slow. The same prompts take the ChatGPT web version at most 10s.
I believe they literally can’t rent enough hardware to meet demand. For example, none of the North American AWS regions have current-generation GPU instances available for provisioning. I would assume Azure is in the same boat, because demand goes where resources are available.
I think the response time is proportional to the length and complexity of the tasks inside your prompt, and the AI may also hallucinate on your requests. I have observed that if a response doesn't come back within a 60-second window, the model has probably started to hallucinate, i.e. make something up unrelated to your request.
An adequate response window is somewhere between 30 and 40 seconds. I don't think anyone should expect an instantaneous response from the APIs. E.g., the task I'm doing is hard to handle with conventional algorithms because it doesn't scale, so I have the AI do it for me instead.
Noticing the same for davinci (text-davinci-003), although with less impact. Still, roughly 20% slower since ~25 April. Read the chart from right to left; the latest requests are on the left. We used to hover around 4.4 seconds response time; now we're at a 5.9s average.