Rate Limit Issue With Fine-Tuned Model

I’m using fine-tuned models with a request volume that is comfortably within the 60 requests/min (per end-user) rate limit. Error message I receive:

status: 429
statusText: Too Many Requests
message: The server is currently overloaded with other requests. Sorry about that! You can retry your request, or contact support@openai.com if the error persists.

@tolga and @letterdrop flagged this in late Dec and early Jan but the issue appears to be ongoing.

Any workarounds or fix from the OpenAI team?

Hey @georg! Sorry for the trouble. Are you getting this error after a period of inactivity (say an hour or so)? Or while actively using the model?

I can’t say for sure yet but it looks as if it’s inconsistent and mostly happening after some period of inactivity.

I think this was the model loading back into our shared capacity. It should work if you retry after a couple of minutes. We’re working on a few things to speed this up. It shouldn’t be an issue if you have continued usage.
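
For anyone who wants to automate the "retry after a bit" advice, here is a minimal sketch of retrying with exponential backoff and jitter. It is not an official client feature: `RateLimitError`, `call_with_backoff`, and `send_request` are hypothetical names, and you would need to raise `RateLimitError` yourself wherever your own code sees the 429 response. The delay values are assumptions, so tune them to how long your model takes to load.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 429."""


def call_with_backoff(send_request, max_attempts=6, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call send_request(), retrying on RateLimitError with growing delays.

    send_request is any zero-argument callable that performs one request.
    The wait doubles after each failed attempt (capped at max_delay), plus
    random jitter so parallel workers do not all retry at the same moment.
    """
    for attempt in range(max_attempts):
        try:
            return send_request()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

Wrap your existing completion call in a small function and pass it in as `send_request`. With the defaults above, the total wait before giving up is roughly half a minute to a minute, which should cover the short stalls described in this thread but not a multi-minute reload.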

Please message me if you continue to have trouble!

Hi Luke, can you elaborate on how long the model has to sit idle before it counts as inactivity?
Also, where can we reach you?

I get that message while actively using the engine. Typically, when it’s been sitting idle, I get it for about 15 seconds, then I’m okay for a little while, then it tends to 429 me occasionally. I’m pretty sure that I’m the only user, and I’d say I do fewer than 2 requests a second, so maybe the shared pool thinks I’m not really busy :)
I also notice that I have this experience with each fine-tune, so loading a second tune means I’m likely to have to wait 10 or so seconds before I get results.

We’re working on reducing these.

It’s variable, so unfortunately I can’t give you a concrete time.

We are getting this error as well.

@luke, I continue to get this error btw. I experimented with various scenarios and it’s not clear what causes it. It appears to be very inconsistent. Sometimes it happens after longer periods of inactivity, sometimes when there are ~2 requests within 5 seconds. I run 4 different fine-tuned models and it happens across all 4.
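
One way to narrow down whether the 429s track inactivity is to log how long each model sat idle before every request. The sketch below is only a rough diagnostic, assuming you can call `record` from your own request code with the model name and the HTTP status you got back. `IdleTracker` and the model name `ft-a` are hypothetical, not part of any OpenAI library.

```python
import time


class IdleTracker:
    """Logs, per model, the idle time before each request and its status."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_seen = {}  # model name -> time of the previous request
        self.records = []     # (model, idle_seconds or None, status)

    def record(self, model, status):
        """Call once per request with the HTTP status that came back."""
        now = self._clock()
        previous = self._last_seen.get(model)
        idle = None if previous is None else now - previous
        self._last_seen[model] = now
        self.records.append((model, idle, status))

    def idle_before_429s(self, model):
        """Idle gaps (seconds) that preceded each 429 for this model.

        The very first request for a model has no known gap and is skipped.
        """
        return [idle for m, idle, status in self.records
                if m == model and status == 429 and idle is not None]
```

If the gaps returned by `idle_before_429s("ft-a")` are mostly large, that points to the reload-after-inactivity explanation. A mix of very small gaps would suggest something else is going on, and those numbers would be useful to share with support.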