The GPT-4o Batch API has been extremely slow

Since January 31, the GPT-4o model’s Batch API has been running significantly slower. Previously, I could get results within 1-2 hours, but since the 31st, it has been taking almost 24 hours, or sometimes even expiring before completion. Am I the only one experiencing this issue?

Hi, and welcome to the Developer community forum,

The Batch API is for tasks that you are happy to let run for up to 24 hours; that is the only guarantee with a call to that endpoint. While you may, and often do, get faster responses when load is low, the flip side is that when demand is high, jobs can sit in the queue for longer.
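
For reference, creating a batch looks roughly like this (a minimal sketch using the official openai Python SDK; "24h" is currently the only completion window you can request, and the filename is a placeholder):

```python
from openai import OpenAI

client = OpenAI()

# Upload the JSONL file of requests, then queue the batch.
batch_input = client.files.create(
    file=open("requests.jsonl", "rb"),  # placeholder filename
    purpose="batch",
)

batch = client.batches.create(
    input_file_id=batch_input.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",  # the only supported window today
)

print(batch.id, batch.status)  # "validating" -> "in_progress" -> "completed"
```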

I often find that running batch jobs on weekends, or after US working hours are over, gives the best speed.

Hi, thanks for your reply!

I understand that results are expected within 24 hours. However, my point is that previously, it would typically take just 1-2 hours, and sometimes even faster on weekends. But since January 31, the performance has drastically slowed down. Also, I’ve noticed similar posts in the community, which suggests that this isn’t just an issue on my end.

I think that’s why they built in the 24-hour window: when network load is higher, they may not have the compute to get through the batches as quickly.

I understand that. But since the 31st, almost every batch has been slow. In the past, if I ran 10 batches, maybe 2 would take longer than usual. Now, if I run 10, I’d be lucky if even 2 finish at the usual speed.

There are also more cases where jobs expire after 24 hours. And it’s still slow even on weekends when demand is lower.

I’ve heard switching models helps, but I have no idea how different the results would be compared to what I was getting before, so switching isn’t really an option.

I’ve found the same thing. My expiration rate has shot up from near 0 to >25%. I’ve halved the size of my batches to try to guarantee completion in 24 hours, but I’m still getting expirations.

I understand that OpenAI doesn’t guarantee completion in under 24 hours, but all of my batches were previously succeeding in < 2 hours, and now the same batches are taking the full 24 hours and still expiring at high rates.

From a business perspective, I now have to budget 48 hours for an answer instead of the ~2 hours that I became accustomed to.
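
One partial mitigation I’ve found (a sketch, assuming the official openai Python SDK): even an expired batch exposes an output file for the requests that did complete, so you can salvage those and resubmit only the remainder.

```python
from openai import OpenAI

client = OpenAI()

batch = client.batches.retrieve("batch_abc123")  # hypothetical batch ID

if batch.status == "expired" and batch.output_file_id:
    # Requests that finished before expiry are still written to the
    # output file, so download those rather than rerunning everything.
    content = client.files.content(batch.output_file_id)
    with open("partial_results.jsonl", "wb") as f:
        f.write(content.read())
```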

I am facing a similar issue. My batch has been stuck at 2200/2208 for more than 4 hours now, and the overall batch is close to expiring. It used to be blazing fast. Is this simply degradation due to increasing volume? I might have to switch back to sending async standard API calls, which is also painful to manage because of all the over-limit errors and retries.
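
For anyone else weighing that option, a small backoff wrapper along these lines keeps the retries manageable (a rough sketch with the official openai Python SDK; the model name and retry count are placeholders):

```python
import asyncio
import random

from openai import AsyncOpenAI, RateLimitError

client = AsyncOpenAI()

async def chat_with_backoff(messages, max_retries=6):
    """Call the standard chat endpoint, backing off on rate limits."""
    for attempt in range(max_retries):
        try:
            return await client.chat.completions.create(
                model="gpt-4o",
                messages=messages,
            )
        except RateLimitError:
            # Exponential backoff with jitter before the next attempt.
            await asyncio.sleep(2 ** attempt + random.random())
    raise RuntimeError("Exhausted retries against the rate limit")
```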

Hi Hao,

I solved the issue by switching models: I’ve been using the gpt-4o-2024-11-20 snapshot for a while now. Hope it works for you too!
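
Since each request line in the batch input pins its own model, switching is a one-line change per JSONL line (a sketch of the documented request format; the custom_id and prompt are placeholders):

```python
import json

# One request line in the batch input JSONL; pinning the dated
# snapshot controls which model version processes the request.
request = {
    "custom_id": "req-1",  # placeholder ID
    "method": "POST",
    "url": "/v1/chat/completions",
    "body": {
        "model": "gpt-4o-2024-11-20",
        "messages": [{"role": "user", "content": "Hello"}],
    },
}

with open("requests.jsonl", "w") as f:
    f.write(json.dumps(request) + "\n")
```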

Cheers!
