The "enqueued tokens" bug is still active

Dear OpenAI Support,

I would like to highlight that the SEVERE ISSUE in the management of batch jobs via the API is still active and NOTHING has been done to address it.

Currently, when using POST /v1/batches, a job may be rejected due to exceeding the 90,000 enqueued token limit (total input + max output tokens).
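For reference, here is the arithmetic as I understand it; the exact server-side formula is not documented, so treat the accounting below as my assumption:

```python
# Assumed accounting (undocumented): enqueued tokens = sum over all lines of
# (estimated input tokens + max_tokens requested for that line).
ENQUEUED_TOKEN_LIMIT = 90_000

requests_in_batch = 300
input_tokens_per_request = 200     # estimated prompt size per line
max_tokens_per_request = 100       # max_tokens set in each request body

enqueued = requests_in_batch * (input_tokens_per_request + max_tokens_per_request)
print(enqueued, "/", ENQUEUED_TOKEN_LIMIT)   # 90000 / 90000 -> one more line and the batch is rejected
```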

I repeat the comments I made in my last post, because they are important:

  • The error message does not provide the actual estimated token count (neither total nor per line).
  • There is no pre-validation tool or endpoint to help estimate token usage beforehand (a client-side workaround sketch follows this list).
  • The token validation logic seems opaque and cannot be replicated by the user.
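The workaround I use today is a client-side pre-check with tiktoken. It is only an approximation, because the encoding, the per-message overhead, and the treatment of a missing max_tokens are all assumptions on my part and may not match what the server actually counts:

```python
import json
import tiktoken  # pip install tiktoken

ENQUEUED_TOKEN_LIMIT = 90_000               # the limit cited in the error
enc = tiktoken.get_encoding("o200k_base")   # assumption: encoding used by current models

def estimate_enqueued_tokens(batch_file: str) -> int:
    """Rough client-side estimate: prompt tokens + max_tokens, summed over all lines."""
    total = 0
    with open(batch_file, encoding="utf-8") as f:
        for line in f:
            body = json.loads(line)["body"]
            prompt_tokens = sum(
                len(enc.encode(m["content"]))
                for m in body["messages"]
                if isinstance(m.get("content"), str)
            )
            # If max_tokens is absent, the server presumably reserves some default;
            # counting 0 here will underestimate in that case.
            total += prompt_tokens + body.get("max_tokens", 0)
    return total

estimate = estimate_enqueued_tokens("batch_input.jsonl")
print(f"estimated enqueued tokens: {estimate} / {ENQUEUED_TOKEN_LIMIT}")
if estimate > ENQUEUED_TOKEN_LIMIT:
    print("batch will likely be rejected - split the file")
```

Even with this check in place I cannot reproduce the server's numbers, which is exactly why a documented formula or an official estimation endpoint would help.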

Also, please take into consideration that I have no active batches at this moment, and the error is raised nonetheless!

This results in:

  • Workflow disruptions that cannot be debugged easily.
  • Developers being forced to apply overly conservative estimates (e.g., characters / 3) just to avoid rejection (see the comparison after this list).
  • Suboptimal batch construction, with significant underuse of allowed token capacity.
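To show what the characters / 3 heuristic costs in practice, here is a quick comparison against the tokenizer count on a repeated English sentence; the exact numbers will vary with your text and the encoding, so this is only an illustration:

```python
import tiktoken

enc = tiktoken.get_encoding("o200k_base")   # assumption about the encoding
text = "Summarize the following support ticket and propose a next action. " * 50

heuristic = len(text) // 3          # the conservative characters/3 estimate
actual = len(enc.encode(text))      # tokenizer count

print(f"characters/3 estimate: {heuristic}")
print(f"tiktoken count:        {actual}")
print(f"capacity left unused:  {heuristic - actual} tokens for a prompt of this size")
```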

### Suggested improvements

I respectfully request the following improvements to the batch API:

  1. Add a visible estimated token count during batch submission:
  • Either globally for the batch
  • Or per job line
  2. In case of rejection, return (a sketch of the response I have in mind follows this list):
  • Estimated total enqueued tokens
  • Line-by-line token estimates
  • A breakdown of which lines caused the overage
  3. Provide an API endpoint to estimate token usage (/v1/token-estimate or similar), usable outside batch mode.
  4. Once a batch completes or fails, expose a live value (e.g., currently_enqueued_tokens) to let users know:
  • How many enqueued tokens are still “reserved”
  • When those tokens will be released
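To make points 2 and 4 concrete, this is the kind of payload I have in mind. None of these fields exist today; apart from currently_enqueued_tokens (named above), every field name and value here is just my proposal:

```python
# Hypothetical rejection/status payload - a proposal, not an existing API response.
proposed_batch_rejection = {
    "error": "enqueued_token_limit_exceeded",
    "limit": 90_000,
    "estimated_enqueued_tokens": 112_430,     # what the server computed for my file
    "currently_enqueued_tokens": 22_430,      # what is still "reserved" on the account
    "estimated_release_at": "2024-06-01T12:00:00Z",  # when the reservation clears
    "lines_over_limit": [                     # which lines caused the overage
        {"custom_id": "job-0042", "estimated_tokens": 1_250},
        {"custom_id": "job-0043", "estimated_tokens": 1_410},
    ],
}
```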

### Additional concern: Token release delay is opaque

It is clear that the system keeps track of how many input tokens are currently “enqueued”, since it blocks new batch submissions based on this invisible quota.
The error message even includes an internal “customer code”, suggesting this value is tracked at account level.

However, after a batch completes (i.e. reaches status = completed), there is no way to know when the previously enqueued tokens are released.

It seems that a background process (perhaps scheduled) eventually clears this quota — but the timing is unknown and undocumented. This adds further unpredictability to batch scheduling.
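For completeness, this is the blind workaround I run today with the openai Python SDK: poll the batch until it leaves validation, and if it was rejected, wait a fixed time and resubmit. In my case the token-limit rejection surfaces as the batch moving to status failed; if your account receives an HTTP error instead, the same wait-and-retry idea applies. The 5-minute wait is a guess, because nothing tells me when the quota is actually released:

```python
import time
from openai import OpenAI  # pip install openai

client = OpenAI()

def submit_with_blind_retry(input_file_id: str, wait_seconds: int = 300) -> str:
    """Submit a batch; if it is rejected for enqueued tokens, wait blindly and retry."""
    while True:
        batch = client.batches.create(
            input_file_id=input_file_id,
            endpoint="/v1/chat/completions",
            completion_window="24h",
        )
        # The token-limit rejection shows up while the batch is still validating.
        while batch.status == "validating":
            time.sleep(30)
            batch = client.batches.retrieve(batch.id)
        if batch.status != "failed":
            return batch.id          # accepted; it will now run normally
        # Rejected (in my case: the enqueued-token error). There is no signal for
        # when the quota is released, so all I can do is wait and try again.
        time.sleep(wait_seconds)
```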

I suggest that the API should:

  • Subtract enqueued tokens as soon as a batch completes
  • Expose the remaining quota via API in real time
  • Provide reliable release timing
  • If NO BATCH is active, release the quota automatically. Reading that error message while nothing is running sounds like a joke.

### Let me repeat my final thoughts

As a developer building an industrial-scale application on top of the OpenAI API, I believe reliability and visibility are essential.
The current behavior — rejecting a batch for token reasons without showing the numbers or letting me know when I can safely retry — is not acceptable in a production environment.
I kindly ask the OpenAI team to consider this issue seriously, and improve the transparency and predictability of batch token management.

Sergio Bonfiglio