Issue with token counts for tool messages

t_phan · July 15, 2025, 7:58pm

When using the OpenAI Chat Completions API, I noticed a huge difference in token counts between user messages and tool messages for CJK languages, especially as the number of messages increases.

I ran an experiment with text = 日 repeated 1000 times. I created requests that use that text as either user messages or tool messages, called the OpenAI Chat Completions API (gpt-4o model), and compared the token counts from tiktoken with the token counts returned by the API (via token_usage). For user messages the two counts are close, but for tool messages there is a huge difference, especially as the number of messages increases. This issue doesn't happen with English. Here are the results:

With 5 messages, the API reports 20,144 tokens for tool messages but only 2,539 tokens for user messages, while tiktoken counts 2,641 tokens.
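For anyone who wants to reproduce the local side of this comparison, here is a minimal sketch. It is not the exact script behind the numbers above. The per-message overhead constants are an assumption borrowed from the commonly used cookbook-style heuristic, the helper names and the `call_{i}` ids are hypothetical, and the estimate counts only `role` and `content`. A real request with tool messages also needs a preceding assistant message carrying the matching `tool_calls`, which is left out here.

```python
from typing import Callable, Dict, List, Sequence

# Assumed overheads (cookbook-style heuristic, not an official per-role spec).
TOKENS_PER_MESSAGE = 3
REPLY_PRIMING_TOKENS = 3


def build_messages(role: str, text: str, count: int) -> List[Dict[str, str]]:
    """Build `count` identical messages with the given role and content."""
    messages = []
    for i in range(count):
        message = {"role": role, "content": text}
        if role == "tool":
            # Hypothetical id; the API expects it to match an earlier tool call.
            message["tool_call_id"] = f"call_{i}"
        messages.append(message)
    return messages


def estimate_chat_tokens(
    messages: List[Dict[str, str]],
    encode: Callable[[str], Sequence],
) -> int:
    """Estimate prompt tokens locally from the role and content fields only."""
    total = REPLY_PRIMING_TOKENS
    for message in messages:
        total += TOKENS_PER_MESSAGE
        for key in ("role", "content"):
            total += len(encode(message.get(key) or ""))
    return total


def discrepancy_ratio(api_prompt_tokens: int, local_estimate: int) -> float:
    """How many times larger the API-reported count is than the local one."""
    return api_prompt_tokens / local_estimate


def gpt4o_encode() -> Callable[[str], Sequence]:
    """Return the tiktoken encoder for gpt-4o (imported lazily)."""
    import tiktoken

    return tiktoken.encoding_for_model("gpt-4o").encode


def run_experiment(num_messages: int = 5) -> None:
    """Print the local estimate for user and tool messages."""
    encode = gpt4o_encode()
    text = "日" * 1000
    for role in ("user", "tool"):
        estimate = estimate_chat_tokens(
            build_messages(role, text, num_messages), encode
        )
        print(role, estimate)
```

Calling `run_experiment()` prints the local estimate for each role. Because this estimate treats both roles the same way, any large gap has to come from how the API counts tool messages. Plugging in the figures above, `discrepancy_ratio(20144, 2641)` comes out to roughly 7.6 for tool messages, while `discrepancy_ratio(2539, 2641)` is just under 1 for user messages.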

anon1374209 · July 15, 2025, 8:08pm

Check whether this bug qualifies for a bounty; it sounds like something that might get you paid. But yeah, it seems like a bug, and it might even be a security bug if it's exploitable.

VeitB · July 15, 2025, 8:15pm

The Bugcrowd program is for security-related issues, not for problems with model behavior and the associated costs.

Model issues are specifically out of scope.
