Fine-tuning job API performance is below expectations

It took 33 minutes to validate the files, and the job has now spent 2 hours in the queued state without starting. Is this the expected level of performance for your fine-tuning job API? smh

With bleeding edge tech and millions upon millions of users, I’d say they’re doing pretty good!

That said, I’m sure the network will improve as time goes on.

Are you in a rush or something?

Validating a trivial file taking 33 minutes indicates the host is severely overloaded or out of physical memory. Either way, it’s obviously a bug. The fact that their status system isn’t detecting it is another bug.

Welcome to the community!

I have never fine-tuned a model before, so I am not sure how long it normally takes. I just want to point out that lots of fine-tuning errors have been reported recently.

Or…

It simply wasn’t a high priority to validate it because the scheduler knew it wasn’t going to start the training anytime soon.

So, it doesn’t look like a bug to me, you just need to temper your expectations a bit.

Failure to scale to meet demand is a bug. Failing to do so silently is also a bug.

When the DMV agents are always working and never sitting around waiting for a customer, that’s the most efficient use of human resources.
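
For anyone weighing the two positions above, here is a rough sketch of the trade-off between keeping workers busy and keeping waits short. It assumes the textbook single-server M/M/1 queue (random arrivals, random service times), which is only an illustration and not a description of how the fine-tuning scheduler actually works. The function name and the rates are made up for the example.

```python
def mean_queue_wait(arrival_rate: float, service_rate: float) -> float:
    """Mean time a job waits before service starts in an M/M/1 queue.

    Uses the standard result Wq = rho / (mu - lambda), where
    lambda is the arrival rate, mu the service rate, and rho = lambda / mu.
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("arrival_rate must be >= 0 and service_rate > 0")
    if arrival_rate >= service_rate:
        # The queue never drains: the backlog grows without bound.
        return float("inf")
    utilization = arrival_rate / service_rate
    return utilization / (service_rate - arrival_rate)


if __name__ == "__main__":
    service_rate = 1.0  # assumed: one job finished per hour on average
    for utilization in (0.5, 0.9, 0.99):
        wait = mean_queue_wait(utilization * service_rate, service_rate)
        print(f"utilization {utilization:.0%}: mean wait of about {wait:.1f} hours")
```

Under these assumptions, the mean wait climbs steeply as utilization approaches 100%, so "agents are never idle" and "customers don't wait long" pull in opposite directions. Where the right balance sits is exactly what this thread is arguing about.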