Frequent server_error on thread run

Hi all,

We use gpt-4o in production with assistants, threads, vector stores, and all that, and for the past 48 hours or more a lot of thread runs have ended up with a server_error status and the error message “Sorry, something went wrong.”

Also, from time to time a random API request holds the connection for exactly 10 minutes before returning a response.

All of this is super annoying and makes it really hard to maintain any decent level of product quality for customers in production.

Anyone else experiencing the same?

Thanks.

I spent hours yesterday figuring out why my queue tasks were taking so long to complete, only to realise that 600-second API responses are happening on a regular basis.

What is happening with OpenAI server infrastructure right now?

I wasn’t using the ChatGPT web interface, only the API.
And around 2+ hours ago it went wild: server errors on most of the assistant thread run replies.

Maybe it’s connected.

A couple of problems I’ve encountered using the OpenAI API:

  • API calls tend to have huge response times (not just LLM responses, but any request)
  • frequent server_error statuses on thread runs
  • thread runs get stuck in the cancelling status and eventually expire

This got me implementing a number of mechanisms to minimise the damage: retries, scheduled retries, splitting queue tasks for one job, parallel queues, sequential queues, synchronous, asynchronous, you name it…
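For anyone in the same boat, here’s a minimal sketch of the kind of retry/timeout wrapper I mean, assuming the openai Python SDK v1.x and its beta Assistants endpoints. The attempt counts, deadlines, and the helper name run_with_retry are my own illustrative choices, not anything official:

```python
# Minimal sketch, assuming the openai Python SDK v1.x (beta Assistants namespace).
# Limits and retry counts below are arbitrary illustrative values.
import time
from openai import OpenAI, APITimeoutError, APIError

# Cap every HTTP request so nothing can hold the connection for 10 minutes.
client = OpenAI(timeout=60.0, max_retries=2)

def run_with_retry(thread_id: str, assistant_id: str, attempts: int = 3):
    for attempt in range(attempts):
        try:
            run = client.beta.threads.runs.create(
                thread_id=thread_id, assistant_id=assistant_id
            )
            deadline = time.monotonic() + 120  # give the whole run 2 minutes
            while run.status in ("queued", "in_progress", "cancelling"):
                if time.monotonic() > deadline:
                    # Cancel a stuck run instead of letting it sit until it expires.
                    client.beta.threads.runs.cancel(
                        thread_id=thread_id, run_id=run.id
                    )
                    break
                time.sleep(2)
                run = client.beta.threads.runs.retrieve(
                    thread_id=thread_id, run_id=run.id
                )
            if run.status == "completed":
                return run
        except (APITimeoutError, APIError):
            pass  # fall through to the next attempt
        time.sleep(2 ** attempt)  # simple exponential backoff between attempts
    raise RuntimeError("run failed after retries")
```

It doesn’t fix the underlying server_error, but it at least keeps one stuck run from blocking a whole queue worker.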

The most unstable API I’ve ever used, and I’ve used a lot of them.

I have encountered the same issue in our production pipeline, which had worked well for almost 5 months. Now I have to switch to a different implementation without any dependence on the Assistants API.

It seems to be gpt-4o, and highly related to the use of file_ids. And it’s ongoing.

On a similar (and meta) vision question, using gpt-4o-2024-11-20 specifically instead of the general alias gives us success, along with gpt-4-turbo as the vision model:

(That the AI is wrong in thinking the screenshot of our Assistants error is about itself, and reports that it cannot do exactly what it is doing as it produces the response, is just added amusement.)
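For context, pinning the dated snapshot rather than the alias looks roughly like this in a plain vision call, assuming the openai Python SDK; the prompt and image URL below are placeholders, not our actual screenshot:

```python
# Sketch only: pinning the dated snapshot instead of the rolling "gpt-4o" alias.
# The prompt text and image URL are placeholders.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-2024-11-20",  # dated snapshot rather than the alias
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What error does this screenshot show?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/error.png"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```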


I think OpenAI must be deploying some services or running large-scale hardware tests on the model today, because whether through API calls or the web UI, the dialogue quality of 4o has dropped significantly.

But I’m not sure if it’s just a problem with my personal account.

Today we officially switched our default model to Gemini after gpt-4o responded “sorry I can’t help” for hours.

Not sure if this is also related, but I’m getting a ton of stuck requests on o3-mini and 4o-mini. Essentially, when trying to create a message or create a run, I never get a response back from the API.

Looks like you’re right!
We have most of our assistants on gpt-4o models and some of them on the gpt-4o-2024-08-11 model, and the latter didn’t produce failed runs at all.

Thank you for this, @_j!
Switching them all to gpt-4o-2024-11-20.
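
In case it saves anyone a few minutes, this is roughly how we’re repointing existing assistants at the snapshot, assuming the openai Python SDK v1.x; filtering only on the bare "gpt-4o" alias is my own choice, not official guidance:

```python
# Rough sketch: repoint existing assistants from the "gpt-4o" alias to the
# dated snapshot. Relies on the SDK's auto-paginating list iterator.
from openai import OpenAI

client = OpenAI()

for assistant in client.beta.assistants.list(limit=100):
    if assistant.model == "gpt-4o":
        client.beta.assistants.update(assistant.id, model="gpt-4o-2024-11-20")
        print(f"updated {assistant.id}")
```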