Don't see any option to request an increase in the API rate limit for the Batch API

I have batch requests (using gpt-4o-mini). But I see a batch API limit of 2M tokens per day (TPD). I have been trying to find ways to request an increase in the limit. I have a task that I want to run which potentially uses about 25M tokens. I don’t want to wait around for 23-24 days to run the task. I don’t see any option to request this on the organization/limits page or the project/limits page.

I am wondering if there is a way to request an increase, if only on a temporary basis, since I will not need this capacity after the task is complete.

The limit is actually the maximum number of tokens that can be enqueued at a time (amount waiting).

Turnaround is under 24 hours, sometimes far less (and sometimes a job is cancelled at the 24-hour mark). So you could potentially push more through in a day by submitting small jobs and enqueuing more as each one completes.
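As a rough illustration of the "small jobs" idea, here is a minimal sketch that groups requests into batches whose estimated tokens stay under a chosen budget, so several can sit in the queue at once and new ones can be submitted as earlier ones finish. The function name `plan_batches`, the 500-token per-request estimate, and the choice of a quarter of the queue limit as the budget are all assumptions for illustration; how tokens are actually counted against the queue limit is not confirmed in this thread.

```python
from typing import Iterable, List

# Tier 1 batch queue limit for gpt-4o-mini, as quoted in this thread.
QUEUE_LIMIT_TOKENS = 2_000_000


def plan_batches(request_tokens: Iterable[int], batch_budget: int) -> List[List[int]]:
    """Greedily group request indices so each batch's estimated tokens fit in batch_budget."""
    batches: List[List[int]] = []
    current: List[int] = []
    used = 0
    for idx, tokens in enumerate(request_tokens):
        if tokens > batch_budget:
            raise ValueError(f"request {idx} alone exceeds the batch budget")
        # Close the current batch before it would go over budget.
        if current and used + tokens > batch_budget:
            batches.append(current)
            current, used = [], 0
        current.append(idx)
        used += tokens
    if current:
        batches.append(current)
    return batches


# Hypothetical workload: 50,000 requests at ~500 estimated tokens each (25M total).
estimates = [500] * 50_000
batch_budget = QUEUE_LIMIT_TOKENS // 4  # assumed: keep each job at a quarter of the queue
batches = plan_batches(estimates, batch_budget)
print(len(batches), "batches, each at most", batch_budget, "estimated tokens")
```

Each inner list holds the indices of the requests that would go into one batch input file; the actual upload and submission steps are left out here.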

You can prepay your way up to a higher “tier”, the only request method being “sending non-refundable money”. Tier 2 takes $50 total paid, with the most recent payment more than 7 days after the first. Not exactly the “fair” that is “ensured” when you’re talking about under $10 total:

Rate limits ensure fair and reliable access to the API by placing specific caps on requests or tokens used within a given time period. Your usage tier determines how high these limits are set and automatically increases as you send more requests and spend more on the API.

The documentation fibs about “send more requests”. You don’t have to make any API requests, just a series of payments.

gpt-4o-mini

| Tier | RPM | RPD | TPM | Batch queue limit |
|---|---|---|---|---|
| Free | 3 | 200 | 40,000 | - |
| Tier 1 | 500 | 10,000 | 200,000 | 2,000,000 |
| Tier 2 | 5,000 | - | 2,000,000 | 20,000,000 |
| Tier 3 | 5,000 | - | 4,000,000 | 40,000,000 |
| Tier 4 | 10,000 | - | 10,000,000 | 1,000,000,000 |
| Tier 5 | 30,000 | - | 150,000,000 | 15,000,000,000 |
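To put the batch queue limits in terms of the 25M-token job from the question, a quick sketch of how many times the queue would have to be filled at each tier. The limits are copied from the table above; `queue_fills` is a hypothetical helper name, and the worst-case reading "one fill per 24-hour window" is an assumption, since batches often finish sooner.

```python
# Batch queue limits for gpt-4o-mini, copied from the table in this thread.
BATCH_QUEUE_LIMIT = {
    "Tier 1": 2_000_000,
    "Tier 2": 20_000_000,
    "Tier 3": 40_000_000,
    "Tier 4": 1_000_000_000,
    "Tier 5": 15_000_000_000,
}

JOB_TOKENS = 25_000_000  # size of the task described in the question


def queue_fills(job_tokens: int, queue_limit: int) -> int:
    """Number of times the queue must be filled to push the whole job through."""
    return -(-job_tokens // queue_limit)  # integer ceiling division


fills = {tier: queue_fills(JOB_TOKENS, limit) for tier, limit in BATCH_QUEUE_LIMIT.items()}
for tier, n in fills.items():
    print(f"{tier}: {n} queue fill(s)")
```

Under the worst-case assumption, Tier 1 needs 13 fills while Tier 2 needs 2.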

You could also contract the batch work out to an organization with 1000x the capacity of tier 1.


Thanks a ton @_j! I eventually paid up and moved to tier 2. Now I can complete this task in 2 days. I wish this table were documented somewhere public. I am not sure that the limit is on the number of tokens enqueued, though. I had exhausted the 2MM token limit and had no batches enqueued, and yet my subsequent requests on the same day kept failing with the reason “token_limit_exceeded”. I think the reset happens at 12:00 am (I am not sure if it’s UTC or PST).

Anyway, this works for me as I intend to use up those credits (and much more) very soon anyway.

Thanks again @_j for your quick response!


Great!

If you look at the quirky way the tier rate limits are positioned, and you aren’t seeking a discount beyond the possibility of a cache hit, you’ll see that at tier 2 you could also be done in about 15 minutes with a stream of individual API calls. :rocket:
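The arithmetic behind that estimate, as a rough sketch: divide the 25M-token job by each tier's tokens-per-minute limit from the table earlier in the thread. This assumes the TPM limit is the only bottleneck and ignores RPM limits, request latency, and retries, so real runs would take somewhat longer; `minutes_at_tpm` is a hypothetical helper name.

```python
# TPM limits for gpt-4o-mini, copied from the tier table earlier in this thread.
TPM_LIMIT = {
    "Tier 1": 200_000,
    "Tier 2": 2_000_000,
    "Tier 3": 4_000_000,
    "Tier 4": 10_000_000,
    "Tier 5": 150_000_000,
}

JOB_TOKENS = 25_000_000  # size of the task described in the question


def minutes_at_tpm(job_tokens: int, tpm: int) -> float:
    """Lower bound on wall-clock minutes if the TPM limit were fully saturated."""
    return job_tokens / tpm


minutes = {tier: minutes_at_tpm(JOB_TOKENS, tpm) for tier, tpm in TPM_LIMIT.items()}
for tier, m in minutes.items():
    print(f"{tier}: at least {m:.1f} minutes")
```

At tier 2 this comes out to 12.5 minutes, which is where the "15 minutes" figure lands once some overhead is allowed for.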
