I received a more detailed response from OpenAI support.
They explained that for the Batch API, the token usage reported in the output files is not complete, and that only the Usage Dashboard should be treated as authoritative.
In my case, it seems that one token corresponds to about half a character.
Here’s the reason they gave for why the token counts in the Batch API responses differ from the dashboard:
> **Why the Totals May Differ:**
>
> The token counts in the Batch API output files represent the tokens processed for the successful completion of each job. However, the totals in the Usage Dashboard and API may include additional factors such as:
>
> - Retries or Partial Processing: If any retries or partial processing occurred during the Batch API jobs, these would contribute to the overall token usage.
> - Overhead Tokens: Certain system-level operations, such as validation or formatting, may result in additional token usage that is not reflected in the Batch API output files.
My dashboard, however, shows no retries or failed jobs.
So the billing seems to be based on “internal token usage,” which I have no visibility into.
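As a rough sanity check on that half-a-character-per-token figure, here is a small helper that compares billed tokens against a character-based estimate. The 4 characters-per-token rule of thumb is my assumption for English prose, not anything OpenAI documents:

```python
def usage_sanity_check(text: str, billed_tokens: int,
                       chars_per_token: float = 4.0) -> dict:
    """Compare billed tokens against a rough character-based estimate.

    English prose typically runs ~4 characters per token (an assumption,
    not an OpenAI-documented constant), so billed usage far above
    len(text) / 4 hints at overhead the output files don't show.
    """
    if billed_tokens <= 0:
        raise ValueError("billed_tokens must be positive")
    estimated = len(text) / chars_per_token
    return {
        "estimated_tokens": round(estimated),
        "billed_tokens": billed_tokens,
        "billed_to_estimate_ratio": billed_tokens / estimated,
        "observed_chars_per_token": len(text) / billed_tokens,
    }
```

In my case `observed_chars_per_token` comes out near 0.5, roughly 8x what the text alone would explain.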