Batch API request limit violates terms of service

To the best of my knowledge, the Batch API does not have a limit on the number of queued requests, only a limit on the number of queued tokens.

See the Batch API FAQ:

What’s the limit of how many requests I can batch?

There is no fixed limit on the number of requests you can batch; however, each usage tier has an associated batch rate limit. Your batch rate limit includes the maximum number of input tokens you have enqueued at one time. You can find your rate limits here.

Once your batch request is completed, your batch rate limit is reset, as your input tokens are cleared. The limit depends on the number of global requests in the queue. If the Batch API queue processes your batches quickly, your batch rate limit is reset more quickly.

See also Batch API guide

There are no limits for output tokens or number of submitted requests for the Batch API today. Because Batch API rate limits are a new, separate pool, using the Batch API will not consume tokens from your standard per-model rate limits, thereby offering you a convenient way to increase the number of requests and processed tokens you can use when querying our API.

I’m at Usage Tier 5. I’m using the Batch API with text-embedding-3-small to process batches of 50,000 requests each, with approximately 200 tokens per request. After sending 20 batches (1,000,000 requests) in quick succession, here’s the error I receive:

Enqueued request limit reached for text-embedding-3-small
in organization org-XXXXXX. Limit: 1,000,000 enqueued requests.
Please try again once some in_progress batches have been completed.

For usage tier 5, the only limit should be 4 billion tokens per day. That would allow me to queue approximately 400 batches at once (4,000,000,000 / (200 × 50,000)), not the 20 batches I’m actually being allowed.

What am I missing?
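For what it’s worth, the observed behavior is consistent with a second, per-model enqueued-request cap applying alongside the documented token limit, with the stricter of the two winning. A minimal sketch of that arithmetic (the limit values are taken from this thread and the error message, not from official documentation):

```python
# Sketch: how a separate enqueued-request cap would produce the
# observed 20-batch ceiling despite the 4B-token daily limit.
# All limit values come from this thread, not official docs.

def max_enqueued_batches(requests_per_batch: int,
                         tokens_per_request: int,
                         token_limit: int,
                         request_limit: int) -> int:
    """Batches that fit under BOTH the enqueued-token and the
    enqueued-request limits; the stricter limit wins."""
    by_tokens = token_limit // (requests_per_batch * tokens_per_request)
    by_requests = request_limit // requests_per_batch
    return min(by_tokens, by_requests)

# Tier-5 token limit (4B/day) plus the undocumented 1,000,000
# enqueued-request cap reported in the error message.
print(max_enqueued_batches(50_000, 200, 4_000_000_000, 1_000_000))  # → 20
```

With only the token limit in play the same function returns 400 batches, matching the calculation above; adding the request cap drops it to 20.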


I am running into the same issue. I have tried manipulating the token count per request, but I still end up hitting this 1,000,000 enqueued-requests limit.
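Since the error message says to retry once some in_progress batches complete, one workaround is to wrap submission in a retry loop that backs off when the cap is hit. A rough sketch; `submit_batch` and the matched error text are placeholders, and real code would call the OpenAI SDK and inspect its actual exception type:

```python
# Sketch: retry batch submission until the queue drains enough to
# accept it. `submit_batch` is a stand-in for a real SDK call.
import time

def submit_with_backoff(submit_batch, max_retries=5, wait_seconds=60):
    """Call submit_batch(); on an enqueued-limit error, wait and retry."""
    for _attempt in range(max_retries):
        try:
            return submit_batch()
        except RuntimeError as err:  # stand-in for the API's limit error
            if "Enqueued request limit" not in str(err):
                raise  # unrelated error: don't swallow it
            time.sleep(wait_seconds)
    raise RuntimeError("queue never drained after retries")
```

This doesn’t raise the cap, of course; it just keeps the pipeline feeding batches in as earlier ones complete.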

  1. This is (probably) a violation of OpenAI’s terms of service.
  2. OpenAI is losing embedding-search business to competitors by doing this.

OpenAI tier 5 is 4B tokens / day, but in practice I’m getting ~600M tokens / day.

If OpenAI made their queue fast enough, one could process a 1 TB dataset for $2k and a 500 TB dataset (like CommonCrawl) for $1M. As of today, however, those will take roughly 1 year and 500 years respectively.
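The year-scale estimate checks out under rough assumptions (mine, not from any official source): about 4 bytes of English text per token, and the ~600M tokens/day effective throughput reported above.

```python
# Rough throughput arithmetic behind the "1 year / 500 years" claim.
# Assumptions: ~4 bytes per token (rough average for English text)
# and the ~600M tokens/day effective rate reported in this thread.

BYTES_PER_TOKEN = 4
EFFECTIVE_TOKENS_PER_DAY = 600_000_000

def days_to_embed(dataset_bytes: float) -> float:
    """Days to push a dataset through at the observed throughput."""
    tokens = dataset_bytes / BYTES_PER_TOKEN
    return tokens / EFFECTIVE_TOKENS_PER_DAY

print(round(days_to_embed(1e12) / 365, 1))  # 1 TB  → ~1.1 years
print(round(days_to_embed(5e14) / 365))     # 500 TB → ~571 years
```

At the advertised 4B tokens/day the 1 TB case would instead take about two months, which is the gap the post is complaining about.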

I don’t quite get that. The terms of use are quite unilateral in favor of the company, as in: “no class action; forced arbitration if you don’t like it.”

Sad if true, but yeah, I haven’t actually checked.

@samuel.da.shadrach @patrickfli We’re running into this issue, too. Did you get any response from OpenAI on this, by any chance?
