Getting a 503 error for multiple requests

I'm getting a 503 Server Error: Service Unavailable for url: https://api.openai.com/v1/chat/completions when I try to send multiple requests to ChatGPT. We have been experimenting with this, so it's an urgent issue that will also impact our current users. My account is fully paid: I have added a card, and I also pay the $20 monthly.


The API and ChatGPT are not related payment-wise.

You pay for the API depending on which model you use and how many tokens you send and receive.

You pay even for the tokens you send when you send too many and get an error as a response.


I don't get it. I have paid for the API as well, so why am I getting errors? I'm using the Chat Completions API with the gpt-3.5-turbo-0613 model, which has a limit of 3,500 RPM / 90,000 TPM after 48 hours. So what could be the reason I still get errors? And aren't 5xx errors a fault on OpenAI's end? I would expect 4xx errors (the 429s I was getting a few weeks ago, for example). The API mostly works, though.

If I get a reason why it's failing, I can narrow it down and fix it. But a 503 makes no sense.
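
Since 503s are server-side, throttling won't make them disappear, but a minimal client-side throttle can at least rule out bursting past the published limit as the cause. A sketch only: `RpmThrottle` is a made-up helper, not part of the openai library, and the 3,500 RPM figure is just the limit quoted above.

```python
import time

class RpmThrottle:
    """Spaces out calls so we never exceed a requests-per-minute budget."""

    def __init__(self, rpm: int = 3500):
        self.min_interval = 60.0 / rpm  # seconds between consecutive requests
        self.last_call = 0.0

    def wait(self):
        elapsed = time.monotonic() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.monotonic()

throttle = RpmThrottle(rpm=3500)
# Call throttle.wait() immediately before each API request.
```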

Yes, you're right about that. I didn't know you had already figured that out.

Maybe it has to do with the data you are sending? Something strange or explicit, for example?

Do you use the moderation endpoint?
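
If not, checking your inputs is cheap. A minimal sketch with the 2023-era openai Python library (0.x), where moderation is exposed as `openai.Moderation.create`; the input string and key are placeholders:

```python
import openai

openai.api_key = "sk-..."  # your API key

result = openai.Moderation.create(
    input="The text you are about to send to chat completions"
)
# Each result carries a per-category breakdown plus an overall flag.
print("flagged:", result["results"][0]["flagged"])
```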

The API seems to be working correctly.

I am having the same issue - I had to build a retry manager around it (sketch below). Unfortunately, it happens quite often, and it is not content-related. In this case, I am translating the names of articles.

For another product, I am getting the same errors. That product is customer-facing, and having to retry adds latency to the UX.

Does anybody have more info on this?
Thanks!
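
For anyone else hitting this, here is the shape of the retry manager mentioned above. A minimal sketch with exponential backoff, assuming the 2023-era openai Python library (0.x); the error class names differ in later versions, and `chat_with_retry` is just an illustrative name.

```python
import time
import openai

openai.api_key = "sk-..."  # your API key

RETRYABLE = (
    openai.error.ServiceUnavailableError,  # 503
    openai.error.APIError,                 # other 5xx
    openai.error.RateLimitError,           # 429
)

def chat_with_retry(messages, model="gpt-3.5-turbo-0613", max_attempts=5):
    delay = 1.0
    for attempt in range(1, max_attempts + 1):
        try:
            return openai.ChatCompletion.create(model=model, messages=messages)
        except RETRYABLE as exc:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            print(f"attempt {attempt} failed ({exc}); retrying in {delay:.0f}s")
            time.sleep(delay)
            delay *= 2  # exponential backoff
```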


Exactly. Wrapping the code with retry functionality is not the issue; the issue is the delay it adds for the end user. In my case, I read "comments" from a file, chunk them according to the prompt size limit, and then send them to ChatGPT chunk by chunk.
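
For the chunking step, something like this works (a sketch; it uses tiktoken for token counting, and the 3,000-token budget is an assumption - leave headroom for the system prompt and the response):

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def chunk_comments(comments, budget=3000):
    """Packs comments into newline-joined chunks that stay under a token budget."""
    chunks, current, used = [], [], 0
    for comment in comments:
        n = len(enc.encode(comment))
        # Start a new chunk when adding this comment would exceed the budget.
        if current and used + n > budget:
            chunks.append("\n".join(current))
            current, used = [], 0
        current.append(comment)
        used += n
    if current:
        chunks.append("\n".join(current))
    return chunks
```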

I read somewhere yesterday that even though OpenAI has published its rate limits, they are not necessarily honored in practice.

That's quite frustrating, even for just manual processing: my translation process has an estimated failure rate between 4% and 6%.


Perhaps batching requests could be a solution; I've read you can send multiple messages in a single request.
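
For example, several items packed into one user message as a numbered list. This is a sketch with the 0.x openai library, not an official batching API, and the titles are placeholders:

```python
import openai

openai.api_key = "sk-..."  # your API key

titles = ["Beispieltitel 1", "Beispieltitel 2", "Beispieltitel 3"]
numbered = "\n".join(f"{i + 1}. {t}" for i, t in enumerate(titles))

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=[
        {"role": "system",
         "content": "Translate each numbered item to English. "
                    "Answer with the same numbering."},
        {"role": "user", "content": numbered},
    ],
)
print(response["choices"][0]["message"]["content"])
```

Fewer requests means fewer chances to hit a 503, at the cost of larger responses to parse.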
