Dear OpenAI Support,
I would like to highlight a significant issue in the management of batch jobs via the API, one that directly affects workflow stability and user trust.
Currently, when using `POST /v1/batches`, a job may be rejected due to exceeding the 90,000 enqueued-token limit (total input + max output tokens).
However:
- The error message does not include the estimated token count (either in total or per line).
- There is no pre-validation tool or endpoint to help estimate token usage beforehand.
- The token validation logic seems opaque and cannot be replicated by the user.
This results in:
- Workflow disruptions that cannot be debugged easily.
- Developers being forced to apply overly conservative estimates (e.g., `characters / 3`) just to avoid rejection (a rough pre-estimation sketch follows this list).
- Suboptimal batch construction, with significant underuse of the allowed token capacity.
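Below is a minimal client-side pre-estimation sketch, assuming the standard batch JSONL format with plain-text chat messages and using the open-source `tiktoken` tokenizer as an approximation of the server-side count (the exact validation logic is not documented, so this can only be a rough estimate):

```python
import json
import tiktoken  # OpenAI's open-source tokenizer, used here only as an approximation

def estimate_enqueued_tokens(jsonl_path: str, model: str = "gpt-4o") -> int:
    """Rough estimate of a batch file's enqueued-token cost:
    tokenized message content plus the reserved max_tokens of every request."""
    try:
        enc = tiktoken.encoding_for_model(model)
    except KeyError:
        enc = tiktoken.get_encoding("o200k_base")  # fallback if the model name is unknown
    total = 0
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            body = json.loads(line)["body"]
            # Only message content is counted; per-message formatting overhead
            # is ignored, so this will not match the server's exact number.
            for msg in body.get("messages", []):
                total += len(enc.encode(msg.get("content") or ""))
            # Reserved output tokens also count toward the enqueued limit.
            total += body.get("max_tokens", 0)
    return total

if __name__ == "__main__":
    # Compare the estimate against the 90,000 enqueued-token limit before submitting.
    print(estimate_enqueued_tokens("batch_input.jsonl"))
```

Even with a sketch like this, the result cannot be trusted to match the server-side validation, which is precisely the problem.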
#### A concrete case
In a recent batch:
- My input file had 4,973 characters.
- Using a conservative formula, I estimated ~1,657 tokens.
- But the real token count in `prompt_tokens` was only 879.
- That’s 5.65 characters per token, far from the expected 3:1 ratio.
This discrepancy shows that the current lack of feedback leads to substantial inefficiency.
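For reference, the inefficiency can be quantified with simple arithmetic (the figures are the ones from the case above):

```python
chars = 4_973              # characters in the input file
heuristic = chars // 3     # conservative estimate -> 1657 tokens
actual = 879               # prompt_tokens reported by the API
print(chars / actual)      # ~5.66 characters per token, not 3
print(heuristic / actual)  # ~1.9x: almost half of the reserved capacity was never needed
```

In other words, the conservative heuristic reserves nearly twice the tokens actually consumed.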
### Suggested improvements
I respectfully request the following improvements to the batch API:
- Add a visible estimated token count during batch submission:
  - Either globally for the batch
  - Or per job line
- In case of rejection, return:
  - Estimated total enqueued tokens
  - Line-by-line token estimates
  - A breakdown of which lines caused the overage
- Provide an API endpoint to estimate token usage (`/v1/token-estimate` or similar), usable outside batch mode; a purely illustrative sketch follows this list.
- Once a batch is completed or fails, expose a live value (e.g., `currently_enqueued_tokens`) to let users know:
  - How many enqueued tokens are still “reserved”
  - When those tokens will be released
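To make the last two points concrete, here is a purely illustrative sketch of what such a pre-validation call could return. Neither the `/v1/token-estimate` endpoint nor any of the field names below exist today; they are hypothetical and only describe the shape of the information I am asking for:

```python
import os
import requests  # illustrative only: this endpoint does not exist today

# Hypothetical request against the proposed pre-validation endpoint.
resp = requests.post(
    "https://api.openai.com/v1/token-estimate",  # proposed path, not a real API route
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={"input_file_id": "file-abc123"},       # placeholder file ID
)

# A response shaped like this would cover both suggestions above
# (every field name is hypothetical):
# {
#   "estimated_enqueued_tokens": 84210,
#   "per_line": [{"custom_id": "req-1", "estimated_tokens": 912}, "..."],
#   "currently_enqueued_tokens": 12000,
#   "enqueued_token_limit": 90000
# }
print(resp.json())
```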
### Additional concern: Token release delay is opaque
It is clear that the system keeps track of how many input tokens are currently “enqueued”, since it blocks new batch submissions based on this invisible quota.
The error message even includes an internal “customer code”, suggesting this value is tracked at account level.
However, after a batch completes (i.e. reaches `status = completed`), there is no way to know when the previously enqueued tokens are released.
It seems that a background process (perhaps scheduled) eventually clears this quota — but the timing is unknown and undocumented. This adds further unpredictability to batch scheduling.
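For illustration, this opacity forces a blind retry loop like the sketch below. It uses the official Python SDK, but the strategy itself is guesswork: I am assuming a rejected batch ends up in a `failed` state after validation, and the wait times are arbitrary because no release timing is documented.

```python
import time
from openai import OpenAI  # official Python SDK; the retry strategy is a workaround sketch

client = OpenAI()

def submit_batch_blindly(input_file_id: str, max_attempts: int = 5):
    """Submit a batch; if it is rejected during validation (e.g. because the
    enqueued-token quota has not been released yet), wait and try again."""
    delay = 300  # 5 minutes: an arbitrary starting point, since no timing is documented
    for attempt in range(1, max_attempts + 1):
        batch = client.batches.create(
            input_file_id=input_file_id,
            endpoint="/v1/chat/completions",
            completion_window="24h",
        )
        # Wait until validation finishes; rejected batches end up as "failed".
        while batch.status == "validating":
            time.sleep(30)
            batch = client.batches.retrieve(batch.id)
        if batch.status != "failed":
            return batch  # accepted and enqueued
        print(f"Attempt {attempt}: rejected, errors: {batch.errors}")
        time.sleep(delay)
        delay *= 2  # exponential backoff, chosen arbitrarily
    raise RuntimeError("Batch kept being rejected; no way to know when the quota frees up")
```

None of this guesswork would be necessary if the quota and its release timing were exposed.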
I suggest that the API should:
- Subtract enqueued tokens as soon as a batch completes
- Expose the remaining quota via API in real time
- Or at least provide a reliable release timing
### Final thoughts
As a developer building an industrial-scale application on top of the OpenAI API, I believe reliability and visibility are essential.
The current behavior — rejecting a batch for token reasons without showing the numbers or letting me know when I can safely retry — is not acceptable in a production environment.
I kindly ask the OpenAI team to consider this issue seriously, and improve the transparency and predictability of batch token management.
Thank you for your attention and the great work you do.
Sergio Bonfiglio