No in-progress batches but got "Enqueued token limit reached"

I still do not understand, though, how it is supposed to work.

The documentation says, and I quote:

Batch API queue limits are calculated based on the total number of input tokens queued for a given model. Tokens from pending batch jobs are counted against your queue limit. Once a batch job is completed, its tokens are no longer counted against that model’s limit.

If I only have batch jobs with status completed or failed, and I submit these jobs:

{
    "model": "gpt-4o-mini",
    "batch_jobs": [
        {
            "batch_id": "batch_6782458a607c81908be2e86ddfed57fd",
            "tokens": 335092,
            "created_at": "2025-01-11 11:18:47.958"
        },
        {
            "batch_id": "batch_6782456a7830819097764f9ed69d3f04",
            "tokens": 341183,
            "created_at": "2025-01-11 11:18:15.175"
        },
        {
            "batch_id": "batch_6782454a0dac8190a6d7db49b397a92d",
            "tokens": 366453,
            "created_at": "2025-01-11 11:17:42.918"
        }
    ]
}

The first and the third get status failed immediately, and only the second one started. Why? I do not understand. I store this information in my database so I can always check the number of tokens being processed before submitting a new batch job, and I still get the limit error.
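
For reference, a minimal sketch of that pre-submission check, assuming the official openai Python library (client.batches.list is the only real API call here; which statuses still count against the queue is my reading of the quoted documentation):

from openai import OpenAI

client = OpenAI()

# Statuses that presumably still count against the enqueued-token limit.
ACTIVE_STATUSES = {"validating", "in_progress", "finalizing", "cancelling"}

# Iterating the list response auto-paginates through every batch on the account.
active = [b for b in client.batches.list(limit=100) if b.status in ACTIVE_STATUSES]
print(f"{len(active)} active batch(es):", [b.id for b in active])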

I’m using tiktoken with the encoder for gpt-4o-mini to count tokens, and I add a safety margin of +250 tokens to the calculation.

I am currently trying to get more information or details from support, but nothing yet …

1 Like
  1. Are you sure it is the batch rate-limit error being produced in all cases? There can be oddities in request data that you would be able to submit as JSON to the API, but whose characters, for some reason, cause issues in batch.
    That can be a file that is not fully compliant UTF-8-encoded JSON with proper escaping for the JSONL “lines” format.

  2. I wouldn’t give positional importance to this. Running through a list, you are still making individual API calls to set each input file ID’s batch in motion: nearly instant, small messages. The preprocessing itself likely has work to do, encoding each of the API requests and checking them against the rate limits, and those internal steps may finish at different times. Or: multiple files sitting in the “validating” stage at once may all get rejected, so that you can’t exploit the input system.

What I would do is submit one at a time. And wait. Watch for the status “validating” to change to “in_progress” (the status they SHOULD have is “enqueued”). Why? After validation, a better calculation of that particular job’s batch token consumption may be applied. Then you can submit the next one against that.
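
A sketch of that one-at-a-time flow (batches.create and batches.retrieve are the real endpoints; the file IDs and the 10-second poll interval are placeholders):

import time
from openai import OpenAI

client = OpenAI()

def submit_and_wait(input_file_id: str) -> str:
    """Create one batch and poll until it leaves 'validating'."""
    batch = client.batches.create(
        input_file_id=input_file_id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
    while batch.status == "validating":
        time.sleep(10)
        batch = client.batches.retrieve(batch.id)
    print(batch.id, "->", batch.status)  # in_progress, failed, ...
    return batch.status

# Submit strictly one at a time, waiting out validation before the next.
for file_id in ["file-aaa", "file-bbb"]:  # hypothetical input file IDs
    submit_and_wait(file_id)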


Token counts include overhead: a few tokens per message (+3 or +4 depending on the model) plus +3 per message list. Then you also have tools to count, and images to count. You can send one of the jobs to the normal API at max_tokens=10 and check that the prompt_tokens cost matches your enqueued estimate.
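
A sketch of that check (the messages list stands in for one task from your batch file; usage.prompt_tokens comes back on every chat completion):

from openai import OpenAI

client = OpenAI()

# Replay one batch task as a normal request, capped at 10 output tokens.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    max_tokens=10,
    messages=[{"role": "user", "content": "one task from the batch file"}],
)

# Compare this against your own tiktoken-based estimate for the same messages.
print("prompt_tokens:", response.usage.prompt_tokens)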

It’s also not clearly stated, but the max_completion_tokens that you set in individual requests may also count against the enqueued tokens, just as your max-token setting counts against the rate limit of normal API requests.

2,000,000 tokens is the enqueued limit for Tier 1. After figuring out how to make the most of what you have, you can look at the rate limits and see whether the next prepayment tier, with 10x the rate, is useful: https://platform.openai.com/docs/guides/rate-limits?context=tier-two

Hi, has anyone been able to find a solution to this “Enqueued token limit reached for gpt-4o-mini …” error? I have been using the Batch API for the past few weeks with no issues; however, 2 days ago I started getting these errors, and now I can’t process a single request.

My best guess is that it has something to do with the ‘failed’/‘cancelled’ batches on my account somehow not being cleared from the cached token count, adding to the enforced limit.

1 Like

Ensure you have enough account balance to cover the entire (maximum possible) cost of your batch job. What has been reported is people overlooking their current credit balance, or auto-recharge not being triggered to cover such a case (or not being triggered at all). It shouldn’t give that error, but it is worth checking.

You could list batches 100 jobs at a time, send a cancel to each of them, and then continue with the next page of 100 jobs. Cancelling everything seen on the batch endpoint might resolve what you suspect.
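
A sketch of that sweep, assuming the openai Python library (iterating its list response fetches the next page of 100 automatically):

from openai import OpenAI

client = OpenAI()

# Batches already in a terminal state cannot, and need not, be cancelled.
CANCELLABLE = {"validating", "in_progress", "finalizing"}

for batch in client.batches.list(limit=100):
    if batch.status in CANCELLABLE:
        client.batches.cancel(batch.id)
        print("cancelled", batch.id)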

If not just a single job, but even a single small batch job can’t be completed (which ensures you aren’t mis-estimating the request size), then it is a persistent account issue that needs a message to the platform site’s API “help”. You’ll have to make sure they understand you don’t want a bot answer, and that your account needs fixing by staff.

I have tried to look for anything that could be causing the errors: I now have no running jobs, I uploaded one of my .jsonl files manually and tried to create a batch job via the Dashboard UI, and I only see it going from “validating” to “failed” with the enqueued token limit error.

I have checked the file again, and I don’t have 2,000,000 tokens at all … not even close. It is just weird, because yesterday I could create batch jobs with around 500,000 tokens correctly, and now these ones with fewer tokens are not being processed; at least some of them, as in my previous example.

If I find something else that could fix or mitigate this issue, I will post it.

Thanks @_j !!

I have been testing but it is really confusing.

I will give some examples.

For reference, I am calculating my tokens like this. This comes directly from OpenAI’s Cookbook.

import tiktoken

def num_tokens_from_messages(messages, model="gpt-4o-mini-2024-07-18"):
    """Return the number of tokens used by a list of messages."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        print("Warning: model not found. Using o200k_base encoding.")
        encoding = tiktoken.get_encoding("o200k_base")
    if model in {
        "gpt-3.5-turbo-0125",
        "gpt-4-0314",
        "gpt-4-32k-0314",
        "gpt-4-0613",
        "gpt-4-32k-0613",
        "gpt-4o-mini-2024-07-18",
        "gpt-4o-2024-08-06"
        }:
        tokens_per_message = 3
        tokens_per_name = 1
    elif "gpt-3.5-turbo" in model:
        print("Warning: gpt-3.5-turbo may update over time. Returning num tokens assuming gpt-3.5-turbo-0125.")
        return num_tokens_from_messages(messages, model="gpt-3.5-turbo-0125")
    elif "gpt-4o-mini" in model:
        print("Warning: gpt-4o-mini may update over time. Returning num tokens assuming gpt-4o-mini-2024-07-18.")
        return num_tokens_from_messages(messages, model="gpt-4o-mini-2024-07-18")
    elif "gpt-4o" in model:
        print("Warning: gpt-4o and gpt-4o-mini may update over time. Returning num tokens assuming gpt-4o-2024-08-06.")
        return num_tokens_from_messages(messages, model="gpt-4o-2024-08-06")
    elif "gpt-4" in model:
        print("Warning: gpt-4 may update over time. Returning num tokens assuming gpt-4-0613.")
        return num_tokens_from_messages(messages, model="gpt-4-0613")
    else:
        raise NotImplementedError(
            f"""num_tokens_from_messages() is not implemented for model {model}."""
        )
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>
    return num_tokens
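
A quick sanity check of the function on a toy message list (the messages are placeholders):

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this document."},
]
# 3 tokens per message + content tokens + 3 to prime the reply
print(num_tokens_from_messages(messages, model="gpt-4o-mini-2024-07-18"))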

BATCH JOB COMPLETED

Here I can compare my calculation of the tokens I am sending, because in the output file generated after a batch job is completed, you can retrieve the prompt tokens that OpenAI calculated.

  • Tokens calculated by me: 70,058
  • Tokens calculated by OpenAI in the output: 87,055

New single batch job

I have created a new file for a batch job with 149,669 tokens. This time, I have included max_completion_tokens in the body of each task written in the file, and I have added those 128 tokens per task to my token calculation, since they should also count toward the enqueued tokens. This batch job also went from ‘validating’ to ‘failed’ without me having any other running batch job.
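
For illustration, one task line in such a .jsonl file looks like this (custom_id and the message content are placeholders; the outer shape is the documented batch request format):

{"custom_id": "task-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o-mini", "max_completion_tokens": 128, "messages": [{"role": "user", "content": "..."}]}}

So the per-task estimate becomes num_tokens_from_messages(messages) + 128.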

I have checked my files for correct encoding, and they are correctly encoded as UTF-8 with no errors. Also, none of my previous batch jobs are still in a status that could indicate ‘enqueued’; they are either completed or failed.

Sadly, I have no ideas left. Also, no response from support.

UPDATE

It really seems like a platform issue to me, because I have submitted another batch job with only 16,329 tokens and it goes directly to ‘failed’, even though I have no running jobs at all.

I have parsed the .jsonl file that I used for this batch job, and everything is correctly encoded and formatted. I sent each of the defined tasks to chat completions using the openai Python library, and every single request worked.
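
For reference, a sketch of that replay, assuming each line follows the batch request format shown above (the filename is a placeholder; the body field maps directly onto the chat completions call):

import json
from openai import OpenAI

client = OpenAI()

# Replay every task in the batch file as a normal chat completions request.
with open("batch_input.jsonl", encoding="utf-8") as f:
    for line in f:
        task = json.loads(line)
        response = client.chat.completions.create(**task["body"])
        print(task["custom_id"], "->", response.choices[0].finish_reason)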

I will try to contact any other kind of support from OpenAI.

Thanks for the tips.

  • I have plenty of credit on my account to cover the maximum possible batch cost, so that’s not the issue.
  • I tried processing a single batch with a single ‘job’ of ~5,000 tokens, and it still failed.
  • Will reach out to customer support and circle back here if I can find a solution.
1 Like

OpenAI has also been pinged to see if this is a larger issue.

1 Like

Thanks @_j : I tried getting help through the suggested ‘Help’ system but it did not result in any additional helpful information.

As a test, I looped through all my batch IDs and made sure they were cancelled. Then I tried uploading a valid file from storage (one that was previously processed successfully), and it also resulted in failure.

Will keep watching this thread for more updates.
Thanks

1 Like

Rather, you would select “messages” in the pop-up help widget and click through the choices, avoiding its suggestions that you read more, until you can report an API problem and actually type a message. It is a phone tree in bot form. It is also staffed by contractors who do the minimum, such as sending back unhelpful AI messages, whereas you need action taken. Beyond your account, the underlying system that could cause such an anomaly needs repair.

Hi all!

Just reporting back. I couldn’t report a bug or anything else regarding the API, but since yesterday I have been able to submit plenty of batch jobs that are processed correctly, without any of them being marked as “Failed” due to the enqueued token limit.

So the calculation of the tokens was correct, including the max_completion_tokens, and to me it just looks like there was definitely an issue during that day.

I hope everybody else is having no issues! :smiley:

Thanks again to everyone posting and helping here! :heart:

1 Like

Same for me! Same batches, same jobs, etc., but now it’s working.
Thanks for bringing attention to this, everyone. All the best!