"That model is currently overloaded with other requests." error when using gpt-3.5-turbo

I am a pay-as-you-go user and use the ChatGPT API in a Python program to process a CSV file containing thousands of rows of short paragraphs. The program always processes around 20-30 rows successfully, then stops with an error message:

'message': 'That model is currently overloaded with other requests. You can retry your request, or contact us through our help center at help.openai.com if the error persists. (Please include the request ID XXXX in your message.)'

I have tried many different times, and every time this happens at some point.

I checked status.openai.com, and it says all models are fully operational. What can I do at this point?

The rate limits are pretty generous - 3,500 requests per minute / 350k tokens per minute on gpt-3.5 models, as long as you’ve had the account for more than 48 hours. But the models do get overloaded from time to time anyway, and it doesn’t tend to make it to the status page!

If, however, you are on the GPT-4 API, the limits drop to circa 200 requests per minute - which model are you using?

Try adding a throttle to the requests, maybe?
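
For example, a minimal sketch, assuming you’re looping over CSV rows and that roughly one request per second keeps you under your limit (process_row and paragraphs.csv are placeholders for your own API call and file):

import csv
import time

with open("paragraphs.csv", newline="") as f:
    for row in csv.reader(f):
        process_row(row)  # placeholder for your OpenAI API call
        time.sleep(1.0)   # throttle: roughly 60 requests per minute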

I have the same issue. I’m using the GPT-3.5 model with the Python API. I’ve already written to the Help Center a couple of times, but nothing. In the last few days it has been impossible to use the model: 40-50 seconds in the best case to obtain a response, and 9 times out of 10 the horrible message "The model is currently overloaded…".
It is impossible to use the API and I can’t do anything. Please provide a "CONCRETE" response and, please, resolve this situation once and for all.
We are customers, and we are paying (pay-as-you-go).
Thank you in advance.

I use gpt-3.5-turbo. For now, I’ve found a temporary solution by adding retry logic to my code. This way, at least the program itself won’t be interrupted and will continue to process the rest of the data.

import time  # needed for the backoff sleep

# Fragment from inside my request function; retries and max_retries
# are initialized before the loop.
result = response.json()
if 'choices' in result:
    address = result['choices'][0]['message']['content']
    return address
else:
    retries += 1
    print(f"Request failed. Retrying ({retries}/{max_retries})...")
    time.sleep(2 ** retries)  # exponential backoff delay
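
One design note on the backoff: capping the delay (e.g. time.sleep(min(2 ** retries, 60))) stops a long run of failures from sleeping for minutes at a time, and adding a little random jitter helps when many clients are retrying at once.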

My guess is that, unless you have usage at the $5k/month-and-up level, the “paying customer” argument isn’t very strong.

Here’s the thing, though – I’m doing an enterprise use case, not a consumer use case. I’d be happy to pay 10x what I pay now, plus a monthly base charge, if I could get enterprise-level support and some performance guarantees … but I think OpenAI is in startup mode, where moving quickly on product is more important than maximizing revenue.

I think you are spot on with your assessment here - we do all have to remember that they are still doing all of this in beta, and whilst we are paying for it, they have no SLAs and the service seems to be provided as-is.

I enjoy using OpenAI and the Azure implementation, but for enterprise use cases I’m leaning more towards the other options, or even self-hosting - it depends on the budget, but I’d be afraid to launch anything commercial on the back of a beta platform with no support, as end users don’t like it when we blame a service that has no backup.

Out of interest, have you considered using any of the other models? It depends a lot on the use case, but it feels more and more that OpenAI is not currently in the mode of giving support or moving to a supported V1 product.

While the status code is 429, this is not a rate-limiting error. It indicates that the model itself is overburdened and has nothing to do with you or your account. It’s clearly documented in OpenAI’s error-handling documentation.

There is nothing to do about it other than retry.
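
If you’re calling it from Python, one common pattern (a sketch assuming the pre-v1 openai package and the tenacity library, which OpenAI’s cookbook suggests for this) is to wrap the call with automatic backoff:

import openai
from tenacity import retry, stop_after_attempt, wait_random_exponential

# Retry on any exception (including the overloaded-model 429) with
# randomized exponential backoff, giving up after six attempts.
@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6))
def completion_with_backoff(**kwargs):
    return openai.ChatCompletion.create(**kwargs)

response = completion_with_backoff(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
)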

Facebook’s isn’t commercially licensed, Google’s isn’t actually available outside a small set of selected alpha developers, and the rest aren’t as good as GPT-3.5. (And I need at least that level of capability.)

On a parallel track, I’m trying a fine-tuning approach on one of the better open source ones, we’ll see how that goes. At least it’s smaller so it infers faster…

I’m doing the same - I will update here with how we get on… I think it is sadly becoming the only option

I’m using gpt-3.5-turbo-0613 and finding the responses much better; I expect a lot of folks use plain gpt-3.5-turbo. Before that I used the March snapshot model. I’m generating chat completions from provided contexts, so a snapshot is perfectly fine for my purpose.