The following is taken from the OpenAI docs. I'm not able to understand the enqueued-prompt-tokens limit, so can someone explain it simply, ideally with an example? Also, the rate limit docs (https://platform.openai.com/docs/guides/rate-limits/usage-tiers) don't mention any limit on the number of tokens in each batch request, so does that mean there is no such limit?
Rate Limits
Batch API rate limits are separate from existing per-model rate limits. The Batch API has two new types of rate limits:
- Per-batch limits: A single batch may include up to 50,000 requests, and a batch input file can be up to 100 MB in size. Note that /v1/embeddings batches are also restricted to a maximum of 50,000 embedding inputs across all requests in the batch.
- Enqueued prompt tokens per model: Each model has a maximum number of enqueued prompt tokens allowed for batch processing. You can find these limits on the Platform Settings page.
There are no limits on output tokens or the number of submitted requests for the Batch API today. Because Batch API rate limits are a new, separate pool, using the Batch API will not consume tokens from your standard per-model rate limits, thereby offering you a convenient way to increase the number of requests and processed tokens you can use when querying our API.
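To make the two limit types concrete, here is a minimal pre-flight sketch of how I'd check a batch input file before submitting it. It assumes the standard .jsonl batch format for /v1/chat/completions with plain-text message contents, and uses tiktoken to estimate prompt tokens; the ENQUEUED_TOKEN_LIMIT value is a made-up placeholder, since the real per-model caps are on the Platform Settings page:

```python
import json
import os

import tiktoken  # pip install tiktoken

# The first two values are from the docs quoted above; the enqueued-token
# cap is a hypothetical placeholder -- the real per-model values are on the
# Platform Settings page and vary by usage tier.
MAX_REQUESTS_PER_BATCH = 50_000
MAX_INPUT_FILE_BYTES = 100 * 1024 * 1024   # 100 MB per batch input file
ENQUEUED_TOKEN_LIMIT = 20_000_000          # hypothetical per-model cap


def preflight_check(path: str, model: str = "gpt-4o-mini") -> int:
    """Roughly validate a .jsonl batch file against the per-batch limits
    and return an estimate of the prompt tokens it would enqueue."""
    enc = tiktoken.encoding_for_model(model)
    n_requests = 0
    prompt_tokens = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            request = json.loads(line)
            n_requests += 1
            # Sum tokens over every message; this slightly undercounts
            # because it ignores per-message formatting overhead.
            for msg in request["body"]["messages"]:
                prompt_tokens += len(enc.encode(msg["content"]))

    assert n_requests <= MAX_REQUESTS_PER_BATCH, "over 50,000 requests"
    assert os.path.getsize(path) <= MAX_INPUT_FILE_BYTES, "input file over 100 MB"
    # Pending batches for the same model also count toward the enqueued cap,
    # so in practice you'd add the tokens already in flight before comparing.
    assert prompt_tokens <= ENQUEUED_TOKEN_LIMIT, "would exceed the enqueued cap"
    return prompt_tokens
```

My current understanding is that "enqueued prompt tokens" counts the estimated prompt tokens of all batches still pending for a given model: a new batch fails if its prompt tokens plus those already in flight exceed the cap, and capacity frees up again as earlier batches finish. Is that right?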