Batch API is now available

Batch queue limits depend on your usage tier. If your limit is 90,000 tokens, that suggests you may currently be in Tier 1.

The usage tier documentation lists the limits by tier and model:

As for your other point: at any given time you can have no more than the stated limit enqueued for batch processing. That is simply how the system is designed.
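The queue limit can be thought of as a running token budget: the enqueued tokens of all in-progress batches plus any new batch must stay under the stated limit. A minimal sketch of that check, using the hypothetical 90,000-token Tier 1 limit mentioned above (the token counts are placeholder numbers):

```python
# Hypothetical Tier 1 batch queue limit of 90,000 enqueued tokens.
QUEUE_LIMIT = 90_000

queued = 70_000      # tokens already sitting in in-progress batches
new_batch = 30_000   # tokens in the batch you want to submit

if queued + new_batch > QUEUE_LIMIT:
    print("would exceed the batch queue limit; wait for a batch to finish")
else:
    print("safe to submit")
```

In other words, a second batch can fail with a queue-limit error even though each batch individually is well under the limit.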


Is gpt-4o supported on the batch API?

Yes, it is supported as well.

Thanks for the response; do you mean I accidentally hit the TPM (tokens per minute) limit by submitting the second batch too soon?

Right now I am already at Tier 3, which allows 600,000 TPM, and my 200,000-token batches work without errors.

I am referring to the batch queue limit, which is a separate limit:

You should be a developer who is already familiar with the chat completions endpoint and has made JSON requests directly to the API rather than through a library module.

If you are unfamiliar with constructing a “chat” manually, you will not have much success assembling many of them into the specially formatted file where each line is a job to be performed.
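To make that file format concrete, here is a sketch of building a batch input file with only the Python standard library: each line is one JSON object describing a chat completions job. The model name, prompts, and filename are placeholders:

```python
import json

# Placeholder prompts; in practice these would be your real tasks.
prompts = ["Summarize document A", "Summarize document B"]

with open("batch_input.jsonl", "w") as f:
    for i, prompt in enumerate(prompts):
        job = {
            "custom_id": f"task-{i}",          # your own ID, echoed back in the results
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "gpt-4o",
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        f.write(json.dumps(job) + "\n")        # one job per line (JSONL)
```

The resulting `batch_input.jsonl` is what you upload before creating the batch; results come back keyed by `custom_id`, so choose IDs you can map back to your tasks.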

Thanks; something has changed there. It is peak hours in America right now, yet processing is exceptionally fast: 600,000 tokens (via 6 files of 100 tasks each) takes about 5 minutes. I am at Tier 3, hitting gpt-4-turbo.


Hello @jeffsharris
Thanks for the Batch API. One question: how do you deal with long responses (over 4,096 tokens)? When I make a single API call, I simply launch a second call with “continue” as the user text (first passing the initial response back as assistant text).
And with the Batch API? Thank you.