GPT-3.5 API is very slow. Any fix?

I have a Plus subscription and also use the API. With the Plus subscription, in the ChatGPT UI, I never had any issues with GPT-3.5 (only with GPT-4, which is slow sometimes).

But I created a simple Python script to generate some short responses based on my prompts (taken from a Google Sheet or CSV) and it’s really slow. It takes around 30-60 seconds per request, while in the chat UI it’s instant.

And it usually times out: if I have a large file with 10-20 entries, it doesn’t finish in one go. It times out or crashes in some other way (at one point it gave me a message related to Cloudflare).

Is anyone having similar issues? Is there any way to fix them?
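
This won't make the API faster, but a batch script like the one described can at least be made to survive timeouts. Below is a minimal sketch, assuming the prompts are plain strings and that `generate` is whatever function calls the API in your script (it is a hypothetical stand-in here, as are the names `process_prompts` and `out_path`). Each finished row is written to disk straight away, so a rerun skips what is already done instead of starting over.

```python
import csv
import os


def process_prompts(prompts, generate, out_path):
    """Run generate(prompt) for each prompt, appending results to out_path.

    Prompts already recorded in out_path are skipped, so a rerun after a
    timeout or crash resumes where the previous run stopped.
    """
    done = set()
    if os.path.exists(out_path):
        with open(out_path, newline="", encoding="utf-8") as f:
            done = {row[0] for row in csv.reader(f) if row}

    with open(out_path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for prompt in prompts:
            if prompt in done:
                continue
            try:
                answer = generate(prompt)
            except Exception as exc:  # assumption: narrow this to your client's errors
                print(f"failed on {prompt!r}: {exc}")
                continue
            writer.writerow([prompt, answer])
            f.flush()  # persist this row before the next slow request
            done.add(prompt)
    return done
```

Running the script a second time with the same `out_path` only retries the prompts that failed.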


Can you share an example of what you are trying to do? Issues like these are usually intermittent and due to high load on the servers.

Yes, indeed, it’s intermittent and probably due to high load on the servers, but it’s always very slow. I mean a 30-60 second response time on the API when the chat UI is almost instant…

Yes, it is very slow lately. I’m also having the same issues.


I also have the same problem. The service responds very quickly when it is first started, but after a period of disuse, it becomes very slow when used again.

Is it possible that the GPT server is overloaded during the day, affecting the API’s response? However, responses through the ChatGPT Plus UI are very fast.

Dear OpenAI,
when are you going to fix the very slow response time?


I’m finding that it currently takes around 90 seconds for one completion with GPT-3.5. Does anyone know if it’s any faster with the Azure OpenAI Service?

I have just the same experience. My ChatGPT Pro is fast (also the legacy bot), but my paid API subscription takes approx. 40-50 seconds for 550 tokens. And it has been like this for some weeks now.


Here is a screen recording of gpt-3.5-turbo and GPT-4 from the playground:

I have a site in production that is suffering badly from the slow gpt-3.5-turbo API.


We have developed a simple UI connected via the API to gpt-3.5-turbo, and it stopped producing output a couple of days ago; before that it was working fine. The input queries did not change. Could you advise on where to start looking in order to fix things?

I’ve noticed this as well. A really big increase in response times, with more failures with status code 524.
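
For what it's worth, 524 is the status Cloudflare returns when the origin server takes too long to answer, so these failures are usually worth retrying. Here is a rough sketch of retrying with exponential backoff and jitter. `TransientError`, `RETRYABLE` and `call_with_backoff` are hypothetical names for illustration, not part of any OpenAI library, and the set of retryable codes is an assumption you should adapt to your client.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


RETRYABLE = {429, 500, 502, 503, 504, 524}  # assumption: codes worth retrying


def call_with_backoff(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request() and retry retryable statuses with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return request()
        except TransientError as exc:
            last_attempt = attempt == max_attempts - 1
            if exc.status not in RETRYABLE or last_attempt:
                raise
            # The delay doubles on each attempt; jitter spreads out
            # retries from clients that failed at the same moment.
            delay = base_delay * (2 ** attempt) * (1 + random.random())
            sleep(delay)
```

In a real script, `request` would wrap the API call and translate the client's error into something carrying the status code. The `sleep` parameter is only there so the waiting can be swapped out in tests.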

Same here. Ridiculous response times for the API. No solutions yet?

I am working on a project using the API. The response is so slow that it is basically useless.

If you try tulp with my OPEN_API_KEY, it is way slower than using

tulp is almost a flat request using the Python OpenAI API:



My service was working well and for the past weeks it’s been a slippery slope to the point that now I don’t know if I can keep running the service…

Can I just pay extra and have it respond in less than 60 seconds…?

Could you find a solution for the slow responses from the GPT-3.5 API?

Inference times for the various models can change with the time of day and week as load on the system varies, and they should in general trend downwards as more compute is added. However, it is important to understand that this is a shared resource and there may be differences in performance over time.

However, I am consistently experiencing very slow responses for each request I submit (typically around 90 seconds to generate some 1000 tokens). What should I do in this case? Any thoughts?
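
To put the numbers reported in this thread side by side, it helps to convert them to tokens per second. This is just arithmetic on the figures posted above (roughly 1000 tokens in 90 seconds, and 550 tokens in 40-50 seconds); the function name is made up for illustration.

```python
def tokens_per_second(tokens, seconds):
    """Average generation rate for one completed request."""
    return tokens / seconds


# Figures as reported by posters in this thread (approximate).
print(round(tokens_per_second(1000, 90), 1))  # 11.1
print(round(tokens_per_second(550, 50), 1))   # 11.0
print(round(tokens_per_second(550, 40), 1))   # 13.8
```

So the separate reports are roughly consistent with each other, at about 11-14 tokens per second. Since total time scales with output length at that rate, requesting shorter completions is one thing that should shorten the wait.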