GPT-3.5 API is very slow. Any fix?

trackscatsteelskylab · October 10, 2023, 7:25am

Getting incredibly slow responses (~ 34 seconds) when generating 300 tokens with GPT 3.5 Turbo API.

The same prompt through ChatGPT 3.5 is about 1 second.

This is a Plus user account and I’ve also paid for API credits, if that matters.

Foxalabs · October 10, 2023, 8:59am

There have been no significant deviations in response generation time in the past 7 days, so any effects you are experiencing must be local to you: either a local instance issue or an edge server problem.

willramos12 · October 11, 2023, 7:00pm

Thanks for sharing that. Where can I access it? I’m having the same issue: all of a sudden my 3.5 turbo responses are taking 3x longer than they did just 72 hours ago.

Foxalabs · October 11, 2023, 7:06pm

You can find this super useful site over at https://openai-status.llm-utils.org, created by one of our very own forum members.

I imagine that any slowdown, if it is indeed caused by a server issue, will be addressed quickly, although it can take time to both detect and resolve these issues. Reporting issues through help.openai.com also helps ensure that tracking and issue monitoring get notified.

shekhargulati123 · October 11, 2023, 8:56pm

I am also seeing slow response times for gpt-3.5-turbo API calls. The graph also shows there was a latency peak.

Foxalabs · October 11, 2023, 9:10pm

Latency peaks for a short period are common for almost all remote API services worldwide. They can be caused by local issues, test environment issues, actual service issues and a whole host of connectivity problems.

Applications that make use of remote APIs should always assume that the endpoint may be unresponsive. They should have suitable error handling, use methods such as retries with exponential backoff, and run any blocking calls in their own threads to allow monitoring and, if required, to update the user with progress.
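For anyone who wants a starting point, here is a minimal sketch of the retry-with-exponential-backoff and own-thread advice above. It is only an illustration under assumptions: `request_fn` is a hypothetical zero-argument callable that wraps whatever API call you make, the helper names and default delays are made up for this example, and a real application would probably retry only on timeouts and rate-limit errors rather than on every exception.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def call_with_backoff(request_fn, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call request_fn(), retrying on any exception with exponential backoff.

    request_fn is a hypothetical zero-argument callable wrapping the API call.
    The wait doubles after each failure (capped at max_delay) and gets up to
    10% random jitter so many clients do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))


def run_in_background(request_fn, executor, **kwargs):
    """Submit the retried call to a worker thread and return a Future.

    The caller can poll future.done() to show progress instead of blocking.
    """
    return executor.submit(call_with_backoff, request_fn, **kwargs)
```

With a `ThreadPoolExecutor` in hand, `run_in_background(my_call, executor)` returns a future, and calling `future.result(timeout=...)` gives you a hard upper bound on how long the application waits.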

fluffypanda · October 11, 2023, 10:26pm

Seeing the same issue: a GPT-4 response for the same query takes about 40 seconds, whereas GPT-3.5-turbo has been consistently around 2.5 minutes. I’ve been testing every day since Friday.

_j · October 11, 2023, 10:51pm

Python test code to run (and slowness measure confirmed by another)

I’m still doing well, compare the “latency” of 1 token to a full response of 512:

Title
[1 tokens in 1.0s. 1.0 tps]
Title: Embracing Digital Transformation: Unlocking the Power of the Digital Age

[128 tokens in 1.9s. 67.6 tps]
Title: Embracing Digital Transformation: Unlocking the Power of the Digital Age

[512 tokens in 7.2s. 70.8 tps]

post-pay, Western US.
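For those who want to compare numbers in the same shape, a rough sketch of the measurement could look like the following. This is not the linked test code, just an assumption about how such a timing could be done: `generate_fn` is a hypothetical callable that sends your request with the given `max_tokens` and returns the number of tokens it actually received, and `measure_tps` and `report` are names invented for this example.

```python
import time


def measure_tps(generate_fn, max_tokens, clock=time.perf_counter):
    """Time one call to generate_fn(max_tokens) and compute tokens per second.

    generate_fn is a hypothetical wrapper around your API call; it must
    return the number of completion tokens it received.
    """
    start = clock()
    tokens = generate_fn(max_tokens)
    elapsed = clock() - start
    tps = tokens / elapsed if elapsed > 0 else float("inf")
    return tokens, elapsed, tps


def report(tokens, elapsed, tps):
    """Format a result line in the same shape as the output above."""
    return f"[{tokens} tokens in {elapsed:.1f}s. {tps:.1f} tps]"
```

Running it once with `max_tokens=1` and once with a larger value separates the fixed latency of a request from the per-token generation speed, which is the comparison being made above.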

Unlike other reports of massive slowing in the last few days:

So this is not a “blame on intermittent stuff and the user”.

Although it does appear to be “sticky” to particular users. Reports of where you are geographically connected, whether you are prepay or billing or free trial, whether you ever paid a bill, etc. could help determine why some are affected and some are fast.

Foxalabs · October 11, 2023, 11:07pm

Yup, there has 100% been an uptick in the number of people with the same complaint of slow performance, but given the number of members, that number seems fairly small, so it’s either a geographic issue with a particular node or something else that is, as you mention, account based… maybe?

As I say on every one of these kinds of problem reports, please send them to help.openai.com as that will at the very least get the issue in front of an AI looking for commonalities, similar things happen here, but doubling up of visibility will not be a bad thing for awareness.

fullstack · October 12, 2023, 6:10am

We’ve just reported a problem. We were having about 3x/4x slowdown with gpt-3.5-turbo over the last few days. We tried other 3.5 models, and they are all the same (London based)

b0zal · October 12, 2023, 9:11pm

It’s the rate limit, maybe.
Works fine for me with the maximum rate limit (thanks, OpenAI).

_j · October 12, 2023, 10:10pm

I’m fast, and have had no need to request higher rate limits, so your idea doesn’t seem to correspond with which users are experiencing performance concerns, in my case at least. There’s “my customers are complaining” tiers of forum API users that are affected.

It could still be other unexplored facets of particular accounts that bring on the slow output production.
