OpenAI API Error: "The server had an error while processing your request. Sorry about that!"

Same error… Some questions get answered if they are short, but with longer ones this error message comes up…

I’m getting the same error but on my AWS Lambda instance. It works on Google Colab.

Hello again, so it seems that this has started to happen again.

Should OpenAI increase the capacity of their servers?

Hello, today problems again with the API using a paid account.

Today, with a pay-as-you-go account, I am now getting:
openai.error.RateLimitError: That model is currently overloaded with other requests. You can retry your request, or contact us through our help center at if the error persists. (Please include the request ID xxx.)
This is after only 3 requests (we should be at 60 RPM).

We find the reliability is extremely bad. It seems there is no capacity to support the amount of requests it is getting. So what is the point of getting the paid account? I can understand having issues in the free version, but the rate of errors in the paid account seems surprisingly high. More often than not we are having issues. Could anyone help at least with information to reset expectations about what we can do with this API?


I think they probably weren't expecting this level of usage of the API, so they will need to increase their server capacity.

I think they will fix this rapidly.

Meanwhile, just add retries in your code. That should fix it for a while.
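A minimal retry sketch for this. Note the helper name `call_with_retries` is my own, and it catches bare `Exception` for illustration; in real code you would wrap your `openai.ChatCompletion.create(...)` call and catch the library's specific error classes instead:

```python
import time

def call_with_retries(fn, max_attempts=5, delay=2.0):
    """Call fn(), retrying on failure up to max_attempts times.

    fn would typically wrap an API call, e.g.
    lambda: openai.ChatCompletion.create(model=..., messages=...).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:  # narrow to e.g. openai.error.APIError in real code
            if attempt == max_attempts:
                raise  # give up after the final attempt
            time.sleep(delay)  # wait a bit before trying again
```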


happening again… The server had an error while processing your request. Sorry about that!
I am adding retries, but that will only increase the server load and make things worse…
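Retries don't have to pile extra load on the server if you back off exponentially and add jitter, so that many clients don't all retry at the same instant. A sketch, with hypothetical helper names (`backoff_delays`, `retry_with_backoff`):

```python
import random
import time

def backoff_delays(base=1.0, cap=60.0, attempts=6):
    """Yield exponentially growing sleep times with full jitter.

    Jitter spreads clients out so synchronized retries don't all
    hit the server at once (the "thundering herd" problem).
    """
    for attempt in range(attempts):
        yield random.uniform(0, min(cap, base * (2 ** attempt)))

def retry_with_backoff(fn, base=1.0, cap=60.0, attempts=6):
    """Call fn(), sleeping a jittered, growing delay between failures."""
    last_exc = None
    for delay in backoff_delays(base, cap, attempts):
        try:
            return fn()
        except Exception as exc:  # narrow to the API's error types in real code
            last_exc = exc
            time.sleep(delay)
    raise last_exc
```

Libraries like tenacity implement the same idea if you'd rather not hand-roll it.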

Same here. My server exposes an API endpoint using gpt-3.5-turbo; sometimes it works and returns a valid response, but most of the time today it returns:
The server had an error while processing your request. Sorry about that!

Same here, using gpt-3.5-turbo: random "The server had an error while processing your request. Sorry about that!" errors.

It’s not great, especially as it’s a paid product, but we need to remember that it’s in beta and does not have an SLA.

Having the same issue here, on average it is taking 6 repeat requests before I can actually get a response…

Beyond this, for larger requests I am seeing 10+ minute response times (for a 3000-token request/response). Is anyone else seeing a dramatic slowdown?

Have an app that was working fine up until a couple of days ago. Less than 10% of my requests are going through, and when they do, the responses are extremely slow. The errors I am getting are 502 errors and this one:

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 60.0 seconds as it raised Timeout: Request timed out: HTTPSConnectionPool(host='', port=443): Read timed out. (read timeout=120.0).

Yes, the time needed to build an article with my gpt-3.5-turbo setup was around 70-80 seconds; the last few days it is obviously slower, anywhere between 150-250 seconds.

I am seeing response times of 3-5 minutes, with many requests timing out after 10 minutes (my timeout). Is that expected? If so, I need to rethink the entire approach.

I would double your timeout during this outage/throttling period. My 3-4 minute requests are returning successfully but often take over 10 minutes at the moment.
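For reference, the openai Python library (0.27.x) accepts a `request_timeout` keyword (in seconds) on its create calls, which it passes to the underlying HTTP request. A sketch with a hypothetical `completion_kwargs` helper; the actual API call is shown commented out:

```python
def completion_kwargs(model, messages, minutes=20):
    """Build create() kwargs with a generous client-side timeout."""
    return {
        "model": model,
        "messages": messages,
        "request_timeout": minutes * 60,  # seconds; raise this during slow periods
    }

# Real call would look like:
# import openai
# resp = openai.ChatCompletion.create(
#     **completion_kwargs("gpt-3.5-turbo", [{"role": "user", "content": "hi"}])
# )
```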

gpt-3.5-turbo very likely won't work right now; text-davinci-003 can still do the job.

Since yesterday, all my GPT-4 API requests time out after 60 seconds. Anyone else experiencing this? Until yesterday I could call the API with max tokens 2000 and it could take several minutes without a problem.

Well, I have two apps: one using Python, hosted in the UK, and another using Vercel.
The Vercel one works just fine, but the Python one keeps timing out and hitting server errors after a few requests. I have updated the Python library to version 0.27.7.