Batch - Enqueued token limit reached for gpt-3.5-turbo

I have created a batch file containing 1,100 requests (I believe the limit is 50,000?), which fails with this error:

Enqueued token limit reached for gpt-3.5-turbo in organization org-xxx. Limit: 200,000 enqueued tokens. Please try again once some in_progress batches have been completed.

Example of a single request:

{"custom_id": "1186", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-3.5-turbo", "max_tokens": 500, "temperature": 0.2, "messages": [{"role": "user", "content": "Write a product description in pure HTML format that is optimized for search engines for a product made by Apple, Apple iPhone 15 Pro 128 GB in Black Titanium with manufacturer's part code MTUV3ZD/A. In the first paragraph describe the product and benefits. In a separate paragraph list the product features, product specifications and product compatibility. Use a professional tone to attract potential customers. Do not include image links. At the end include in a small red font \"Disclaimer: While every reasonable effort is made to ensure that the product specification is accurate, no guarantees for the accuracy of information are made.\""}]}}
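
(As an aside: if these lines are generated with json.dumps rather than assembled by hand, the nested quotes around the disclaimer are escaped automatically and the JSON stays valid. A minimal sketch, where batch_line is a hypothetical helper and the file name is illustrative:)

```python
import json

def batch_line(custom_id: str, prompt: str) -> str:
    # json.dumps escapes the quotes nested inside the prompt (e.g. around
    # the disclaimer), which hand-built strings or word-processor "smart
    # quotes" would silently break.
    request = {
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-3.5-turbo",
            "max_tokens": 500,
            "temperature": 0.2,
            "messages": [{"role": "user", "content": prompt}],
        },
    }
    return json.dumps(request)

with open("batch_input.jsonl", "w", encoding="utf-8") as f:
    f.write(batch_line("1186", "Write a product description ...") + "\n")
```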

I have a max_tokens limit of 500 per item, so I'm wondering why I'm hitting a token limit with what is a relatively low number of requests for a batch.

I have lowered the batch from 1,100 to 600 requests, which appears to have been accepted as it is now processing, but anything much higher than this fails.

Welcome to the Forum!

The answer is basically in front of your eyes 🙂 Your organization has a batch queue limit of 200,000 enqueued tokens for gpt-3.5-turbo. With 1,100 requests, each reserving up to 500 output tokens (its max_tokens) on top of its input tokens, you are enqueueing at least 550,000 tokens (1,100 × 500 for the outputs alone), which is more than double your organization's limit.

For future reference: to determine the maximum number of requests per batch, divide the batch queue limit for your org/model by the average number of tokens in a single request (counting both the input tokens and the max_tokens reserved for the output).
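
A minimal sketch of that calculation in Python, assuming the tiktoken tokenizer and a file named batch_input.jsonl (the exact accounting OpenAI applies may differ by a few tokens per message due to chat-format framing):

```python
import json
import tiktoken  # pip install tiktoken

QUEUE_LIMIT = 200_000  # enqueued-token limit for gpt-3.5-turbo at this tier
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def enqueued_tokens(line: str) -> int:
    # Rough per-request estimate: prompt tokens plus the max_tokens
    # reserved for the completion.
    body = json.loads(line)["body"]
    prompt = sum(len(enc.encode(m["content"])) for m in body["messages"])
    return prompt + body.get("max_tokens", 0)

with open("batch_input.jsonl", encoding="utf-8") as f:
    per_request = [enqueued_tokens(line) for line in f if line.strip()]

avg = sum(per_request) / len(per_request)
print(f"average tokens per request: {avg:.0f}")
print(f"requests that fit in one batch: {int(QUEUE_LIMIT // avg)}")
```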


Thanks for the quick reply

Based on what you've said, 200,000 tokens with the 500-token limit per request is approximately 400 requests per batch, which I find quite disappointing.

With a batch turnaround of up to 24 hours, if I wanted to put in the maximum 50,000 requests it would take 125 batches, i.e. up to 125 days (over 4 months), which is far too long for business use!
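
For what it's worth, a minimal sketch of that chunk-and-wait workflow, assuming the official openai Python SDK; the chunk size, file names, and polling interval are illustrative, and in practice batches often finish well before the 24-hour window:

```python
import time
from openai import OpenAI  # pip install openai

client = OpenAI()
CHUNK_SIZE = 400  # ~200,000 queue limit / ~500 tokens per request

with open("batch_input.jsonl", encoding="utf-8") as f:
    lines = [line for line in f if line.strip()]

for i in range(0, len(lines), CHUNK_SIZE):
    chunk_path = f"chunk_{i // CHUNK_SIZE}.jsonl"
    with open(chunk_path, "w", encoding="utf-8") as out:
        out.writelines(lines[i:i + CHUNK_SIZE])

    uploaded = client.files.create(file=open(chunk_path, "rb"), purpose="batch")
    batch = client.batches.create(
        input_file_id=uploaded.id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
    # Wait for this chunk to finish before enqueuing the next, so the
    # enqueued-token total never exceeds the organization's limit.
    while batch.status not in ("completed", "failed", "expired", "cancelled"):
        time.sleep(60)
        batch = client.batches.retrieve(batch.id)
    print(chunk_path, batch.status)
```

Serializing the chunks like this trades throughput for staying under the enqueued-token cap; the 24-hour figure is the worst case per chunk, not the typical one.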

Are there any other AI alternatives which are not so crippling?

Well, you simply have to move up to a higher organizational tier to overcome these constraints.

For example, in Tier 3 the batch queue limit for the same model is already 10,000,000 tokens, and in Tier 4 it is 100,000,000. The screenshot below shows the criteria that must be fulfilled in order to move to a higher tier.

[Screenshot: usage tier qualification criteria]

I hope that helps!


Ahhh ok so there is hope!

Thanks for the info, that helps.
