Batch - Enqueued token limit reached for gpt-3.5-turbo

I have created a batch file containing 1,100 requests (I believe the limit is 50,000?), which fails with this error:

Enqueued token limit reached for gpt-3.5-turbo in organization org-xxx. Limit: 200,000 enqueued tokens. Please try again once some in_progress batches have been completed.

Example of a single request:

{"custom_id": "1186", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-3.5-turbo", "max_tokens": 500, "temperature": 0.2, "messages": [{"role": "user", "content": "Write a product description in pure HTML format that is optimized for search engines for a product made by Apple, Apple iPhone 15 Pro 128 GB in Black Titanium with manufacturer's part code MTUV3ZD/A. In the first paragraph describe the product and benefits. In a separate paragraph list the product features, product specifications and product compatibility. Use a professional tone to attract potential customers. Do not include image links. At the end include in a small red font \"Disclaimer: While every reasonable effort is made to ensure that the product specification is accurate, no guarantees for the accuracy of information are made.\""}]}}
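
(As an aside: if these lines are generated with json.dumps rather than assembled by hand, the nested quotes around the disclaimer are escaped automatically and the JSON stays valid. A minimal sketch, where batch_line is a hypothetical helper and the file name is illustrative:)

```python
import json

def batch_line(custom_id: str, prompt: str) -> str:
    # json.dumps escapes the quotes nested inside the prompt (e.g. around
    # the disclaimer), which hand-built strings or word-processor "smart
    # quotes" would silently break.
    request = {
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-3.5-turbo",
            "max_tokens": 500,
            "temperature": 0.2,
            "messages": [{"role": "user", "content": prompt}],
        },
    }
    return json.dumps(request)

with open("batch_input.jsonl", "w", encoding="utf-8") as f:
    f.write(batch_line("1186", "Write a product description ...") + "\n")
```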

I have a max_tokens limit of 500 per item, so I'm wondering why I'm hitting a token limit with what is a relatively low number of requests for a batch.

I have lowered the batch from 1,100 to 600 requests, which appears to have been accepted as it is now processing, but anything much higher than this fails.

Welcome to the Forum!

The answer is basically in front of your eyes 🙂 Your organization has a batch queue limit of 200,000 enqueued tokens for gpt-3.5-turbo. With 1,100 requests, each reserving up to 500 output tokens (its max_tokens) on top of its input tokens, you are enqueueing at least 550,000 tokens (1,100 × 500 for the outputs alone), which is more than double your organization's limit.

For future reference: to determine the maximum number of requests per batch, divide the batch queue limit for your org/model by the average number of tokens in a single request (counting both the input tokens and the max_tokens reserved for the output).
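
A minimal sketch of that calculation in Python, assuming the tiktoken tokenizer and a file named batch_input.jsonl (the exact accounting OpenAI applies may differ by a few tokens per message due to chat-format framing):

```python
import json
import tiktoken  # pip install tiktoken

QUEUE_LIMIT = 200_000  # enqueued-token limit for gpt-3.5-turbo at this tier
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def enqueued_tokens(line: str) -> int:
    # Rough per-request estimate: prompt tokens plus the max_tokens
    # reserved for the completion.
    body = json.loads(line)["body"]
    prompt = sum(len(enc.encode(m["content"])) for m in body["messages"])
    return prompt + body.get("max_tokens", 0)

with open("batch_input.jsonl", encoding="utf-8") as f:
    per_request = [enqueued_tokens(line) for line in f if line.strip()]

avg = sum(per_request) / len(per_request)
print(f"average tokens per request: {avg:.0f}")
print(f"requests that fit in one batch: {int(QUEUE_LIMIT // avg)}")
```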


Thanks for the quick reply

Based on what you've said, 200,000 tokens with the 500-token limit per request is approximately 400 requests per batch, which I find quite disappointing.

With a batch turnaround of up to 24 hours, if I wanted to put in the maximum 50,000 requests it would take 125 batches, i.e. up to 125 days (over 4 months), which is far too long for business use!
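
For what it's worth, a minimal sketch of that chunk-and-wait workflow, assuming the official openai Python SDK; the chunk size, file names, and polling interval are illustrative, and in practice batches often finish well before the 24-hour window:

```python
import time
from openai import OpenAI  # pip install openai

client = OpenAI()
CHUNK_SIZE = 400  # ~200,000 queue limit / ~500 tokens per request

with open("batch_input.jsonl", encoding="utf-8") as f:
    lines = [line for line in f if line.strip()]

for i in range(0, len(lines), CHUNK_SIZE):
    chunk_path = f"chunk_{i // CHUNK_SIZE}.jsonl"
    with open(chunk_path, "w", encoding="utf-8") as out:
        out.writelines(lines[i:i + CHUNK_SIZE])

    uploaded = client.files.create(file=open(chunk_path, "rb"), purpose="batch")
    batch = client.batches.create(
        input_file_id=uploaded.id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
    # Wait for this chunk to finish before enqueuing the next, so the
    # enqueued-token total never exceeds the organization's limit.
    while batch.status not in ("completed", "failed", "expired", "cancelled"):
        time.sleep(60)
        batch = client.batches.retrieve(batch.id)
    print(chunk_path, batch.status)
```

Serializing the chunks like this trades throughput for staying under the enqueued-token cap; the 24-hour figure is the worst case per chunk, not the typical one.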

Are there any other AI alternatives which are not so crippling?

Well, you simply have to move up to a higher organizational tier to overcome these constraints.

For example, in Tier 3 the batch queue limit for the same model is already 10,000,000 tokens, and in Tier 4 it is 100,000,000. The screenshot below shows the criteria that must be fulfilled in order to move to a higher tier.

[Screenshot: usage tier qualification criteria]

I hope that helps!


Ahhh ok so there is hope!

Thanks for the info, that helps.
