When using the OpenAI Chat Completions API, I noticed a huge difference in token counts between user messages and tool messages for CJK languages, especially as the number of messages increases.
I ran an experiment with text = 日 repeated 1000 times. I built requests that send that text as either user messages or tool messages, called the OpenAI Chat Completions API (gpt-4o model), and compared the token counts computed locally with tiktoken against the token counts returned by the API (via token_usage). For user messages the two counts are close, but for tool messages there is a huge difference, and it grows as the number of messages increases. This issue doesn’t happen with English. Here are the results:
As you can see, with 5 messages the API reports 20,144 tokens for tool messages but only 2,539 tokens for user messages, while tiktoken counts 2,641 tokens.
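
For anyone who wants to reproduce the tiktoken side of the comparison, here is a minimal sketch of how the local estimate could be computed. It is not the exact script behind the numbers above. The helper names (`build_messages`, `estimate_tokens`, `gpt4o_encoder`) are made up for illustration, and the overhead constants (3 tokens per message, 3 tokens priming the reply) are assumptions taken from the commonly used counting recipe, not something the API guarantees. It also counts only the role and content fields, so it ignores the `tool_call_id` and the preceding assistant tool call that a real tool message needs.

```python
from typing import Callable, Dict, List

# The test text from the experiment: one CJK character repeated 1000 times.
TEXT = "日" * 1000


def build_messages(role: str, n: int, text: str = TEXT) -> List[Dict[str, str]]:
    """Build n messages with the same role and content (hypothetical helper)."""
    return [{"role": role, "content": text} for _ in range(n)]


def estimate_tokens(
    messages: List[Dict[str, str]],
    encode: Callable[[str], List[int]],
    tokens_per_message: int = 3,  # assumed per-message overhead
    reply_priming: int = 3,       # assumed overhead for priming the reply
) -> int:
    """Estimate prompt tokens locally from the role and content strings."""
    total = reply_priming
    for message in messages:
        total += tokens_per_message
        total += len(encode(message["role"]))
        total += len(encode(message["content"]))
    return total


def gpt4o_encoder() -> Callable[[str], List[int]]:
    """Return the tiktoken encode function for gpt-4o (needs tiktoken installed)."""
    import tiktoken  # imported lazily so the helpers above work without it

    return tiktoken.encoding_for_model("gpt-4o").encode
```

Calling `estimate_tokens(build_messages("tool", 5), gpt4o_encoder())` gives the local estimate for 5 tool messages, and swapping `"tool"` for `"user"` gives the user-message estimate. Because this estimate depends only on the strings and is nearly identical for both roles, a large gap in the count reported by the API must come from how the server handles tool messages rather than from the text itself.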
