Hello,
I’m reaching out publicly after approximately two weeks of extensive testing with the Batch API, into which I have invested significant personal time and resources. Despite multiple attempts to resolve these issues through direct support channels, the problems persist, so I am posting a detailed, transparent report here.
Summary of Main Issues:
Billing discrepancies: Costs are unpredictable and do not match actual token usage.
High failure rates: Batch requests frequently fail at rates of up to 90%.
Detailed Results of Latest Experiment:
Experiment Conditions:
Start: April 26, 2025, at 04:24
Duration: 2 days, 6 hours, 24 minutes, and 17 seconds
Total batches sent: 93 (varying from 1 to 100 items per batch)
Model used: o4-mini-2025-04-16
Request Statistics:
API Responses: Total requests – 885; Successful – 758; Failed – 97.
Dashboard / Usage: Reports 1,838 total requests.
Conclusion: The Dashboard figures appear significantly inflated and inconsistent with the actual API data. Even setting aside my participation in the free token program (up to 10 million tokens for the o4-mini model), the Dashboard numbers are unreasonably high.
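For anyone who wants to cross-check their own numbers the same way, the per-request outcomes can be tallied directly from the downloaded batch output/error .jsonl files instead of trusting the Dashboard. A minimal sketch (the field names follow the documented Batch output format; the sample lines are illustrative, not my real data):

```python
import json

def summarize_batch_output(jsonl_lines):
    """Tally successes, failures, and token usage from Batch API
    output lines (one JSON object per request, as found in the
    downloaded output/error .jsonl files)."""
    totals = {"requests": 0, "succeeded": 0, "failed": 0,
              "input_tokens": 0, "output_tokens": 0}
    for line in jsonl_lines:
        rec = json.loads(line)
        totals["requests"] += 1
        resp = rec.get("response") or {}
        if rec.get("error") or resp.get("status_code") != 200:
            totals["failed"] += 1
            continue
        totals["succeeded"] += 1
        usage = resp.get("body", {}).get("usage", {})
        totals["input_tokens"] += usage.get("prompt_tokens", 0)
        totals["output_tokens"] += usage.get("completion_tokens", 0)
    return totals

# Illustrative lines in the batch output format (not real data):
sample = [
    '{"custom_id":"req-1","response":{"status_code":200,'
    '"body":{"usage":{"prompt_tokens":100,"completion_tokens":50}}},'
    '"error":null}',
    '{"custom_id":"req-2","response":null,'
    '"error":{"code":"server_error","message":"..."}}',
]
print(summarize_batch_output(sample))
# {'requests': 2, 'succeeded': 1, 'failed': 1, 'input_tokens': 100, 'output_tokens': 50}
```

Running this across every output and error file for the run gives an independent request and token count to compare against the Dashboard.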
Billing Statistics:
Calculation based on actual token usage: Input – $3.100174; Output – $33.510385; Total – $36.610559.
Billing: Actual balance reduction matches Dashboard usage exactly.
Conclusion: Actual charges exceed calculated amounts by approximately 1.5 times.
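For anyone repeating this calculation, the expected charge is simply tokens times the per-million-token rate. A minimal sketch (the rates passed in below are placeholders, not the actual o4-mini prices; check the current pricing page, and note that Batch API usage is billed at a discount off the standard rates):

```python
def expected_cost_usd(input_tokens, output_tokens,
                      input_rate_per_m, output_rate_per_m):
    """Expected USD charge given token counts and USD-per-1M-token rates."""
    return (input_tokens / 1_000_000 * input_rate_per_m
            + output_tokens / 1_000_000 * output_rate_per_m)

# Placeholder rates for illustration only -- substitute the real ones:
print(round(expected_cost_usd(1_000_000, 2_000_000, 0.55, 2.20), 2))  # 4.95
```

Plugging the token totals recovered from the batch output files into a formula like this is how I arrived at the calculated amounts above.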
Observations:
The discrepancy in the number of requests is extremely large even after accounting for failed requests; the Dashboard figures are more than double the real data.
This discrepancy likely affects token count accuracy as well.
A key oversight on my part was not documenting this issue thoroughly when I first discovered it: at the time, my funds were being depleted roughly 10 times faster than expected. Furthermore, there is currently no transparency into how my free token allocation is being consumed, which makes accurate billing verification difficult. Nonetheless, even excluding free tokens entirely, I have been charged at least 1.5 times more than what the documented usage would justify.
I kindly ask OpenAI to urgently investigate and address these significant discrepancies.
Thank you for your attention to this critical matter.
For pending jobs showing up in billing, I would wonder if it’s equivalent to “preauthorizing your card”, where your input and max_tokens are evaluated against your credit balance to confirm you can pay for the job.
That’s how it is with fine-tuning - where you needed enough funds to pay for the job even when it was free.
If not holding back funds per job, you could submit thousands of $0.10 batches against your $0.10 balance.
Then you may need to wait a while for the billing to settle, just as billing for usage can show up only the following day.
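If the hold works the way described above, it would amount to reserving the worst-case cost of the batch: full input plus max_tokens of output for every request. A rough sketch of that hypothesis (the function name and rates are illustrative, not OpenAI’s actual mechanism):

```python
def estimated_hold_usd(input_tokens_per_req, max_tokens, n_requests,
                       input_rate_per_m, output_rate_per_m):
    """Hypothetical worst-case batch cost: every request consumes its
    full input and generates max_tokens of output. Holding this amount
    would prevent queueing thousands of small batches against a tiny
    balance."""
    per_request = (input_tokens_per_req / 1_000_000 * input_rate_per_m
                   + max_tokens / 1_000_000 * output_rate_per_m)
    return per_request * n_requests

# e.g. 100 requests, 1k input tokens each, max_tokens=500, placeholder rates:
print(round(estimated_hold_usd(1_000, 500, 100, 1.0, 4.0), 4))  # 0.3
```

Under this hypothesis the balance reduction at submission time would look inflated, but it should be reconciled back down to actual usage once the batch completes.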
I am willing to accept the mechanics described, provided that once the job completes, the resulting figures align with actual usage. As noted in my initial post, there is currently a substantial discrepancy between the number of requests submitted and the tokens actually consumed after all batches completed.
Hi, thank you for reporting the issue. I am unable to identify your organization ID from the Batch ID for the specified dates. Could you please provide the batch ID, request ID, and the date range when those were run? I will investigate as soon as possible. Thank you!
Hi there. I am also experiencing significant discrepancies between (ex-ante) predicted tokens and requests, and the OpenAI-calculated (ex-post) tokens and requests. For example, on May 12th I submitted a batch job with 10k requests and 8M completion tokens were returned. OpenAI’s usage dashboard is reporting 24k requests and 63M completion tokens. I only see the single batch output in the Batches page, and the numbers simply do not align with what was submitted. Did they ever follow up with you on this?
Hi! Please create a Support Ticket with OpenAI, and provide your Org ID, Batch Job IDs, and the Batch Output you had received. We will investigate your issue and get back to you. Thank you!
Hi - you can request human help, reference this thread, and ask for an investigation. Please make sure to reference https://community.openai.com/t/1245120/ discussion in the ticket you open - Thank you!
Hey! Yes, they did follow up, and in my case, the issue was partially resolved. I’d suggest putting together detailed stats for one specific case and contacting them via the support chat. The process can take some time, but in my experience, the support team genuinely tries to help and won’t leave you hanging.