Thanks everyone, good to know I'm not the only one getting these issues. It seems it's just a matter of trying again and again until it works, then.
I really encourage you to include extra layers such as retries with backoff, fallback strategies, and so on (if you're not doing it yet). This technology is new, and OpenAI's engineering team is doing an AMAZING job scaling their servers up to cover all the increasing demand they're getting. But this is still expected: things can fail, and we (developers) are the ones responsible for coming up with sound strategies for when they do.
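As a minimal sketch of what such a layer can look like, assuming the pre-1.0 `openai` Python client and the `backoff` library (the function names, model name, and the `None` fallback are just illustrative):

```python
import backoff
import openai

# Retry the transient errors reported in this thread, with exponential
# backoff and a cap so a dead endpoint fails fast enough to fall back.
@backoff.on_exception(
    backoff.expo,
    (openai.error.RateLimitError,
     openai.error.APIConnectionError,
     openai.error.Timeout),
    max_tries=5,
)
def complete(prompt):
    return openai.Completion.create(model="text-davinci-003", prompt=prompt)

def complete_with_fallback(prompt):
    try:
        return complete(prompt)
    except openai.error.OpenAIError:
        return None  # fallback: placeholder, cached answer, secondary model, etc.
```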
Still happening a bit on 3/22, especially during US work hours. For sure build some safeguards into your code while the poor IT and DevOps folks at OpenAI experience the biggest and fastest scaling challenge anyone has ever faced.
I'm using @backoff.on_exception(backoff.expo, openai.error.RateLimitError) from the backoff library, but today I still see APIConnectionError and timeouts. Can you suggest how to account for these errors so that my loop of requests does not break?
You probably just need to set up a try/except statement to handle all errors.
In the except clause you can either pass out some sort of placeholder (e.g., None) or else re-queue the content for later.
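Something along these lines (a rough sketch; `process` and `rlist` are placeholders for your own call and work list):

```python
results = []
retry_queue = []

for item in rlist:
    try:
        results.append(process(item))   # your API call goes here
    except Exception as e:
        print(f"Failed on {item}: {e}")
        results.append(None)            # placeholder keeps downstream code working
        retry_queue.append(item)        # or re-queue the item for a later pass
```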
Thanks, so I tried this:
```python
for i in rlist:
    try:
        # mycode
    except TimeoutError:
        print("error")
        continue
```
But the loop still breaks. Is TimeoutError correct here?
I'm based in the EU and have run into these aborted connections a lot over the last week. It's definitely correlated with US working hours…
I'd happily pay more (2-4x the token rate) for a more reliable endpoint at this stage, but I don't see that option. I hope the team can figure this out soon; in the meantime, I'll be implementing various retry mechanisms like @AgusPG recommended above.
Following up on this topic, in case it helps. There's an API that lets you do a quick health check of every OpenAI model, so you can make your request strategy depend on it. It's still pretty easy to implement a health check service like this one yourself, doing dumb API calls from time to time. But in case you want to try it out, folks, you can check it here.
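If you'd rather roll your own, a minimal sketch of the "dumb API calls from time to time" idea might look like this (pre-1.0 `openai` client assumed; the model names and 60s interval are just examples, and `monitor` would run in a background thread):

```python
import time

import openai

MODELS = ["gpt-3.5-turbo", "gpt-4"]          # whichever models you depend on
health = {model: True for model in MODELS}   # shared "logbook" of model health

def ping(model):
    # Cheap one-token request whose only purpose is to see if the model answers.
    try:
        openai.ChatCompletion.create(
            model=model,
            messages=[{"role": "user", "content": "ping"}],
            max_tokens=1,
            request_timeout=10,
        )
        return True
    except Exception:
        return False

def monitor(interval=60):
    while True:
        for model in MODELS:
            health[model] = ping(model)
        time.sleep(interval)
```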
I like the Try Model X → Try Model Y → Try Model Z → Retry Later approach.
Is there a benefit to Ping Models X, Y, Z → try Model Y if X is down, Model Z if Y is down, etc.?
My only guess is that you could achieve lower overall latencies if you know about the outage ahead of time. Is this the only benefit?
Yep, that's pretty much it. Say that you have a client timeout of 30s per model. Models X and Y are down. It takes you 1 minute to get to model Z and get a completion out of it. This is a killer for conversational interfaces, where the user will just run away if they don't have their answer quickly.
Pinging the models in advance and keeping a logbook of the health of each model prevents you from continuously trying to get completions from models that are having an outage. So you go straight to model Z (and only retry on it) if you suspect that models X and Y are having an outage.
This improves the UX, in my view.
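As a sketch of that routing logic, assuming a `health` dict like the one in the monitoring snippet above (the model names and 30s timeout are illustrative):

```python
import openai

FALLBACK_CHAIN = ["gpt-4", "gpt-3.5-turbo", "gpt-3.5-turbo-0301"]  # X -> Y -> Z

def complete_with_logbook(messages, health):
    for model in FALLBACK_CHAIN:
        if not health.get(model, True):
            continue  # logbook says it's down: skip without burning a timeout
        try:
            return openai.ChatCompletion.create(
                model=model, messages=messages, request_timeout=30
            )
        except openai.error.OpenAIError:
            health[model] = False  # record the outage for subsequent requests
    raise RuntimeError("All models in the fallback chain appear to be down")
```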
Just adding a "solution" I've found. I tried to catch different specific errors, but I found that there are so many different errors the platform can throw (timeouts, remote disconnections, bad gateways, just to mention a few) that it's best to use a catch-all except statement for now (although not ideal). I've found this works quite well for me:
```python
import time

import openai

for sample in samples:
    inference_not_done = True  # reset per sample, or later samples are skipped
    while inference_not_done:
        try:
            response = openai.Completion.create(...)
            inference_not_done = False
        except Exception as e:
            print("Waiting 10 minutes")
            print(f"Error was: {e}")
            time.sleep(600)
```
I do not agree with catching generic exceptions. It's a bad practice. Also, you do not want to handle all your exceptions the same way. There are some errors where it's worth retrying, some others where it's worth falling back, and some others that you should never retry.
You can customize your app to handle exceptions based on status codes instead (such as "retry this specific payload for all 5xx errors"). For instance, in Python, aiohttp_retry does a pretty decent job here. It's the one that I'm currently using.
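For example, something like this (a sketch from memory; double-check the aiohttp_retry docs for the exact options):

```python
import asyncio

from aiohttp_retry import RetryClient, ExponentialRetry

async def main():
    # Retry up to 4 times with exponential backoff, but only on 5xx status
    # codes; 4xx errors (bad request, auth, etc.) are not worth retrying.
    retry_options = ExponentialRetry(attempts=4, statuses={500, 502, 503, 504})
    async with RetryClient(retry_options=retry_options) as client:
        async with client.post(
            "https://api.openai.com/v1/chat/completions",
            headers={"Authorization": "Bearer YOUR_API_KEY"},
            json={"model": "gpt-3.5-turbo",
                  "messages": [{"role": "user", "content": "Hello"}]},
        ) as response:
            print(response.status, await response.json())

asyncio.run(main())
```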
Hope it helps!
I have been using my "train_3_motive_en.csv" model to generate some texts since yesterday. Yesterday it ran well. However, I have been getting this error for the last 7 hours:
openai.error.APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Any idea what is going on?
Best Regards,
Zahurul
I'm using the chat completion endpoint, currently with gpt-3.5-turbo. For a few weeks up until yesterday, I'd been running dozens of requests every hour without any issues. Yesterday, about half of them started failing with the same APIConnectionError that's been reported here. Today, most of them, around 80% of the requests, are failing with that error.
Shouldn't the status.openai.com page reflect that issue with the API?
Hi @AgusPG
I agree with you, catching generic exceptions is a bad engineering practice, and it can have pretty bad consequences in software development. My solution was more suitable for an NLP researcher looking for a quick fix, who just wants the output results to analyse offline. Being a researcher, but with a previous developer background, I see what you mean, but I've also seen a lot "worse" in research code than catching a generic exception in order to get things working in the short term. Definitely not advisable for a scalable application, and if you have a live app with real users then my solution is not for you. And thanks for pointing out aiohttp_retry, I'll look into it.
I'm also still getting tons of connection errors, timeouts, and 502s, even after yesterday's fix. Backoff helps, but my requests often retry 3+ times before my serverless functions time out…
Oh yeah, absolutely. If you can work offline and do not need real-time, I agree that you can be more flexible as regards the software development part of your app.
One observation I'm curious whether any of you have witnessed…
Context: I'm using the text completion API (not chat), and my application is built to iterate through various texts, calling the API each time.
Observation: When I run this app/code, it will work for the first 4-9 API calls, executing each in under 1 second, and then subsequent API calls will either be extremely slow (>90s) or fail with the exception in this thread.
Has anyone else seen this behavior? It seems like there is some unofficial throttling going on.
This could be true, now you mention it.
We run a chained query of 10 prompts. We often get an issue near the end of the chain. We catch it and retry, and it continues after a short delay.
Hadn't considered it up to now.
I am receiving this issue as well. Worse still, despite the API closing my connection without a response, I have been charged regardless!
I'm using GPT-4 and getting close to the maximum token limit, which I believe has something to do with it. When I just run the test code that the API documentation suggests, it runs flawlessly. Very strange!