No batches in progress but get "Enqueued token limit reached"

I still do not understand, though, how it is supposed to work.

The documentation says, and I quote:

Batch API queue limits are calculated based on the total number of input tokens queued for a given model. Tokens from pending batch jobs are counted against your queue limit. Once a batch job is completed, its tokens are no longer counted against that model’s limit.

If I only have batch jobs with status completed or failed, and I submit these jobs:

{
    "model": "gpt-4o-mini",
    "batch_jobs": [
        {
            "batch_id": "batch_6782458a607c81908be2e86ddfed57fd",
            "tokens": 335092,
            "created_at": "2025-01-11 11:18:47.958"
        },
        {
            "batch_id": "batch_6782456a7830819097764f9ed69d3f04",
            "tokens": 341183,
            "created_at": "2025-01-11 11:18:15.175"
        },
        {
            "batch_id": "batch_6782454a0dac8190a6d7db49b397a92d",
            "tokens": 366453,
            "created_at": "2025-01-11 11:17:42.918"
        }
    ]
}

The first and the third get status failed immediately, and only the second starts. Why? I do not understand. I store this information in my database so I can always check the number of tokens being processed before submitting a new batch job, and I still get the limit error.

I’m using tiktoken with the gpt-4o-mini encoder to calculate tokens, and I add a margin of +250 tokens to the calculation.

Currently trying to see if I can get more information or details from support, but nothing yet …

1 Like
  1. Are you sure that it is the batch rate-limit error being produced in all cases? There can be oddities in request data that you could submit as JSON to the synchronous API, yet whose characters cause issues in batch.
    That can be a file that is not fully compliant UTF-8-encoded JSON with proper escaping for the JSONL “lines” format.

  2. I wouldn’t give positional importance to this. Running through a list, you are still making individual API calls to set each input-file-ID batch in motion: nearly instant, small messages. The preprocessing itself likely has work to do, encoding each of the API requests and checking against the rate limits, and those internal steps may finish at different times. Or: multiple files in the “validating” stage may give you rejections so that you can’t exploit the input system.
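
A pre-upload sanity check along the lines of point 1 can be scripted. This is a minimal sketch, assuming each `.jsonl` line carries the documented `custom_id`/`method`/`url`/`body` fields for `/v1/chat/completions` batch requests:

```python
import json

REQUIRED_KEYS = {"custom_id", "method", "url", "body"}

def validate_batch_jsonl(raw: bytes) -> list[str]:
    """Return human-readable problems found in a batch file; empty means OK."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8: {exc}"]
    problems = []
    for i, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            obj = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {i}: invalid JSON ({exc.msg})")
            continue
        missing = REQUIRED_KEYS - obj.keys()
        if missing:
            problems.append(f"line {i}: missing keys {sorted(missing)}")
    return problems
```

An empty return means the file at least decodes and parses cleanly; it does not guarantee the Batch API will accept it.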

What I would do is submit one at a time. And wait. Watch for the status “validating” to change to “in_progress” (the status they SHOULD have is “enqueued”). Why? After validation, a better calculation of one particular job’s batch-token consumption could be applied. Then you can submit the next against that.


Token counts are +4 per message and +3 per message list. Then you also have tools to count and images to count. You can send one of the jobs to the normal API at max_tokens=10 and see whether the prompt_tokens cost matches your enqueued count.
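
That spot check can be scripted. A minimal sketch, assuming the official `openai` Python library and that each batch line's `body` is a standard Chat Completions request; `prompt_token_delta` and its arguments are hypothetical names for illustration, not an OpenAI API:

```python
# Replay one batch request against the synchronous endpoint with a tiny
# max_tokens, then compare the API's own prompt_tokens to a local estimate.
try:
    from openai import OpenAI  # optional: only needed for a live check
except ImportError:
    OpenAI = None  # the helper below still works with any stub client

def prompt_token_delta(client, body: dict, my_estimate: int) -> int:
    """Return (API prompt_tokens) minus (my local estimate) for one request."""
    resp = client.chat.completions.create(
        model=body["model"],
        messages=body["messages"],
        max_tokens=10,  # keep the completion tiny; only prompt_tokens matters
    )
    return resp.usage.prompt_tokens - my_estimate

# Live usage sketch (assumes OPENAI_API_KEY is set):
#   delta = prompt_token_delta(OpenAI(), one_batch_body, my_count)
# A nonzero delta shows how far off your local tokenizer math is.
```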

It’s also not clearly stated, but the max_completion_tokens that you set in individual requests may also count against the enqueued tokens, just as your max-token setting counts against the rate limit of normal API requests.

2,000,000 tokens is the enqueued limit for tier 1. You can look at the rate limits and see whether the next prepayment tier, at 10x the rate, is useful after figuring out how to make the most of what you have. https://platform.openai.com/docs/guides/rate-limits?context=tier-two

Hi, has anyone been able to find a solution to this "Enqueued token limit reached for gpt-4o-mini …" error? I have been using the Batch API for the past few weeks with no issues; however, 2 days ago I started getting these errors, and now I can’t process a single request.

My best guess is that it somehow has to do with the ‘failed’/‘cancelled’ batches on my account not being cleared from the cached token count, adding to the enforced limit.

2 Likes

Ensure you have enough account balance to cover the entire (maximum possible) cost of your batch job. What has been reported is people overlooking their current credit balance, or auto-recharge not being triggered to cover such a case (or not being triggered at all). It shouldn’t give that error, but it’s worth checking.

You could list batches 100 jobs at a time, send a cancel to each of them, and then continue with the next page of 100. Cancelling everything visible on the batch endpoint might resolve what you suspect.
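
With the official `openai` Python library, that loop might look like the sketch below; the set of statuses treated as still "enqueued" is an assumption based on the documented batch lifecycle:

```python
# Page through all batches 100 at a time and cancel any that still count
# against the enqueued-token limit.
try:
    from openai import OpenAI  # optional: only needed to build a real client
except ImportError:
    OpenAI = None

ACTIVE_STATUSES = {"validating", "in_progress", "finalizing"}  # assumption

def cancel_active_batches(client) -> int:
    """Cancel every batch whose status suggests it is still queued or running."""
    cancelled = 0
    after = None
    while True:
        page = client.batches.list(limit=100, after=after)
        for batch in page.data:
            if batch.status in ACTIVE_STATUSES:
                client.batches.cancel(batch.id)
                cancelled += 1
        if not page.has_more:
            break
        after = page.data[-1].id  # cursor for the next page
    return cancelled

# Live usage sketch: cancel_active_batches(OpenAI())
```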

If even a single small batch job can’t be completed (ruling out mis-estimating the request size), then it is a persistent account issue that needs a message through the platform site’s API “help”. You’ll have to make sure they understand that you don’t want a bot answer and that your account needs fixing by staff.

I would try to look for anything that could be causing errors, but right now I have no running jobs: I uploaded one of my .jsonl files manually, tried to create a batch job via the Dashboard UI, and I only see it go from “validating” to “failed” with the enqueued-token-limit error.

I have checked the file again, and I don’t have 2,000,000 tokens at all, not even close. It is just weird, because yesterday I could create batch jobs with around 500,000 tokens correctly, and now these jobs with fewer tokens are not being processed, or at least some of them aren’t, as in my previous example.

If I find something else that could fix or mitigate this issue, I will post it.

Thanks @_j !!

I have been testing but it is really confusing.

I will give some examples.

For reference, I am calculating my tokens like this. This is coming directly from OpenAI’s Cookbook.

import tiktoken

def num_tokens_from_messages(messages, model="gpt-4o-mini-2024-07-18"):
    """Return the number of tokens used by a list of messages."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        print("Warning: model not found. Using o200k_base encoding.")
        encoding = tiktoken.get_encoding("o200k_base")
    if model in {
        "gpt-3.5-turbo-0125",
        "gpt-4-0314",
        "gpt-4-32k-0314",
        "gpt-4-0613",
        "gpt-4-32k-0613",
        "gpt-4o-mini-2024-07-18",
        "gpt-4o-2024-08-06"
        }:
        tokens_per_message = 3
        tokens_per_name = 1
    elif "gpt-3.5-turbo" in model:
        print("Warning: gpt-3.5-turbo may update over time. Returning num tokens assuming gpt-3.5-turbo-0125.")
        return num_tokens_from_messages(messages, model="gpt-3.5-turbo-0125")
    elif "gpt-4o-mini" in model:
        print("Warning: gpt-4o-mini may update over time. Returning num tokens assuming gpt-4o-mini-2024-07-18.")
        return num_tokens_from_messages(messages, model="gpt-4o-mini-2024-07-18")
    elif "gpt-4o" in model:
        print("Warning: gpt-4o and gpt-4o-mini may update over time. Returning num tokens assuming gpt-4o-2024-08-06.")
        return num_tokens_from_messages(messages, model="gpt-4o-2024-08-06")
    elif "gpt-4" in model:
        print("Warning: gpt-4 may update over time. Returning num tokens assuming gpt-4-0613.")
        return num_tokens_from_messages(messages, model="gpt-4-0613")
    else:
        raise NotImplementedError(
            f"""num_tokens_from_messages() is not implemented for model {model}."""
        )
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == "name":
                num_tokens += tokens_per_name
    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>
    return num_tokens

BATCH JOB COMPLETED

Here I can compare my token calculation against OpenAI’s, because the output file generated after a batch job completes includes the prompt tokens that OpenAI counted.

  • Calculated tokens by myself: 70,058
  • Calculated tokens by OpenAI in the output: 87,055

New single batch job

I have created a new file for a batch job with 149,669 tokens. This time, I included max_completion_tokens in the body of each task written in the file, and I added those 128 tokens to my calculation as well, since they should also count toward the enqueued tokens. This batch job also went from ‘validating’ to ‘failed’ without me having any other running batch job.

I have checked my files for correct encoding, and they are valid UTF-8; there are no errors. Also, none of my previous batch jobs are in a status that could indicate ‘enqueued’: they are all either completed or failed.

Sadly, I have no ideas left. Also, no response from support.

UPDATE

It really seems like a platform issue to me, because I submitted another batch job with only 16,329 tokens and it went directly to ‘failed’, even without any running jobs at all.

I have parsed the .jsonl file that I used for this batch job, and everything is correctly encoded and formatted. I also sent each of the defined tasks to chat completions using the openai Python library, and every single request worked.

I will try to contact any other kind of support from OpenAI.

1 Like

Thanks for the tips.

  • I have plenty of credit on my account to cover the maximum possible batch cost, so that’s not the issue.
  • I tried processing a single batch with a single ‘job’ of ~5,000 tokens and it still failed.
  • Will reach out to customer support and circle back here if I can find a solution.
1 Like

OpenAI has also been pinged to see if this is a larger issue.

1 Like

Thanks @_j : I tried getting help through the suggested ‘Help’ system but it did not result in any additional helpful information.

As a test, I made sure to loop through all my batch ids and ensure they were cancelled. Then tried uploading a valid file from storage (that was previously processed successfully) and it also resulted in failure.

Will keep watching this thread for more updates.
Thanks

1 Like

Rather, you would select “messages” in the pop-up help widget, click through the choices, and avoid its suggestions that you read more, until you can report an API problem and actually type a message. It is a phone tree in bot form. It is also staffed by contractors who do the minimum, like sending back unhelpful AI messages, whereas you need action taken. Beyond your account, the underlying system that could cause such an anomaly needs repair.

Hi all!

Just reporting back. I couldn’t report a bug or anything else regarding the API, but since yesterday I have been able to submit plenty of batch jobs that are processed correctly, without any of them being marked as “failed” due to the enqueued-token limit.

So my token calculation was correct, including the max_completion_tokens, and to me it just looks like there was definitely an issue during that day.

I hope everybody else is having no issues! :smiley:

Thanks again to everyone posting and helping here! :heart:

1 Like

Same for me! Same batches, same jobs etc but now it’s working.
Thanks for bringing attention to this everyone. All the best!

One approach that worked for me was creating a new API key in my default project, using it for about a day, and then switching back to my primary keys.

But then it is for sure an issue, because today almost all my batch jobs are failing immediately. For example, I am currently running a single batch job with 40,000 tokens, and every other job that I try to start gets rejected immediately for “exceeded enqueued tokens”.

I hope this is a known issue to someone in OpenAI because there is no way I can get through the bots to report this issue properly.

Also, another important thing I’ve noticed: the few batch jobs that do get processed produce almost entirely incorrect task output. I have a JSON schema to work with structured outputs, and until yesterday, for many days, it was working perfectly :sweat_smile: but now the failed-batch ratio is way too big, and of the jobs that go through, 95% are incorrect.

Hi! Same here. I started using batch APIs yesterday, sending a single batch with 10 requests, each containing 2,500 tokens. Initially, everything worked fine for a few runs, but then I started encountering the error (I’m on Tier 1).

Even after waiting 12 hours, the issue persists. There are no in-progress batches, but as soon as I send a new batch, it fails immediately. I’ve always sent one batch at a time.

Another question: what exactly does the 90,000-token quota mean? Can I send a single batch that exceeds 90,000 tokens? The term “enqueued” isn’t very clear to me.

Same issue here! Really annoying; is there any temporary solution or workaround for this?
I tried deleting all the batch history and storage, but no luck. It looks like my account is flagged and every batch request fails instantly.

In my case, after about 24 hours everything was working again. But still… something is not clear…

After some investigation, here are my findings:

My goal is to analyze images, processing one image per request. Each image is encoded in base64, and I use structured output.

When I send a single request (prompt + image) to client.beta.chat.completions.parse, it works as expected. The response indicates that approximately 4K tokens are used in total.

However, when I create a batch with the same single request, it sometimes fails with an Enqueued token limit reached error. Interestingly, if I copy the batch content and check it with the tokenizer tool, it shows around 100K tokens.

My hypothesis is that the batch API pre-processes the batch file without accounting for the fact that an image is involved, leading it to calculate 100K tokens instead of 4K. As a result, the request fails.
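
A rough back-of-envelope check of this hypothesis: base64 inflates an image by about 4/3, and text tokenizers typically yield roughly one token per 3 to 4 characters of base64, so a ~220 KB image counted as plain text would land near 100K tokens. A sketch, where the 3-characters-per-token ratio is an assumption:

```python
import base64
import math

def naive_text_token_estimate(b64_payload: str) -> int:
    """Rough text-token estimate: ~1 token per 3 chars of base64 (assumed)."""
    return math.ceil(len(b64_payload) / 3)

image_bytes = b"\x00" * 220_000  # stand-in for a ~220 KB image
b64 = base64.b64encode(image_bytes).decode("ascii")
print(len(b64))                        # 293336 base64 characters
print(naive_text_token_estimate(b64))  # 97779 "tokens" if counted as text
```

Compare that with the ~4K tokens the same request actually uses when billed as a vision input: a validator tokenizing the raw base64 would overcount by orders of magnitude, which matches the ~100K shown by the tokenizer tool.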

But here’s where it gets confusing: when the batch does work, what’s happening differently? My second guess is that if the batch request can start executing immediately, it bypasses enqueuing, allowing it to proceed even with the inflated token count. However, if the request has to wait, the tokens are placed in a queue, hit the limit, and trigger the failure.

What do you think?

I think it’s a bug in the API, or at least it’s not clear from the docs…

2 posts were merged into an existing topic: Enqueued token limit reached for gpt-4o in organization ###

A post was split to a new topic: Enqueued token limit reached for gpt-4o in organization ###