Using the API, the token count is off

I am not sure why I am getting a higher token count; maybe I am misunderstanding how tokens are counted. See the image below. I made a Postman request saying just “hi”, and the response coming back is “Hello! How can I assist you today?” The response being 9 tokens seems right, but just saying “hi” is 8 tokens? I expected the prompt token count to be 1 in this instance.
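For reference, here is a rough equivalent of that Postman request using the Python SDK (a sketch only; the model name and the SDK call are my assumptions, since the original screenshot is not reproduced here):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hi"}],
)

print(resp.choices[0].message.content)  # e.g. "Hello! How can I assist you today?"
print(resp.usage.prompt_tokens)         # 8 -- why not 1?
print(resp.usage.completion_tokens)     # 9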

Is this behaviour stable and reproducible, or a single occurrence?

It’s very simple, ask ChatGPT 🙂
There are some “additional tokens involved for processing the message”.

I need to count tokens, so I need to know your internal representation of the prompt when I send “hi” to gpt-3.5-turbo in plain text.

When you send a message like “hi” to GPT-3.5-turbo, the internal representation of the prompt in terms of tokenization is quite straightforward. The word “hi” itself constitutes one token. However, there are additional tokens involved for processing the message, such as special tokens used to denote the start and end of a prompt, or tokens representing spaces or punctuation if present.

For the message “hi”, the token breakdown would typically be:

  1. A special token to indicate the start of the input (like <|startoftext|> in some models).
  2. The token for “hi”.
  3. Potentially a token for the end of the input or a separator (like <|endoftext|>).

In total, this would typically amount to 2-3 tokens, depending on the exact formatting and the model’s tokenization rules. It’s important to note that the actual token count can vary slightly depending on the specific implementation and version of the model being used.
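As a quick check on the first part of that answer, “hi” by itself really is a single token (a minimal sketch, assuming the cl100k_base encoding used by gpt-3.5-turbo):

import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")
print(len(encoding.encode("hi")))  # 1 -- "hi" on its own is a single token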


It’s stable. I replicated this behaviour in Postman myself.
Additional tokens are needed for communication.


The containerization of messages within special tokens causes token overhead. The role names also add tokens: “user” is prefixed to what the user wrote, and “assistant” is inserted where the AI should write.

You can send “hi” to a completion model instead and only be billed for what you see. What you get back might not be so concise in token usage though…
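For example (a sketch, assuming the Python SDK and a completion model such as gpt-3.5-turbo-instruct rather than the Postman setup above):

from openai import OpenAI

client = OpenAI()

resp = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt="hi",
    max_tokens=20,
)

print(resp.usage.prompt_tokens)      # 1 -- just the "hi" token, no chat envelope
print(resp.usage.completion_tokens)  # however many tokens the model rambles back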

Here you have a very good explanation of these special tokens, which create an envelope for the message: https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
and here is part of the tokenizer code (chapter 6, “Counting tokens for chat completions API calls”):

elif model == "gpt-3.5-turbo-0301":
    tokens_per_message = 4  # every message follows <|start|>{role/name}\n{content}<|end|>\n

That should explain why the count is not just 1 for “hi”; see the sketch below.
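Putting the pieces together (a sketch based on the cookbook’s counting function; it assumes tiktoken and a model with 3 tokens of per-message overhead, such as gpt-3.5-turbo-0613):

import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")
messages = [{"role": "user", "content": "hi"}]

num_tokens = 0
for message in messages:
    num_tokens += 3                                # per-message envelope tokens
    for key, value in message.items():
        num_tokens += len(encoding.encode(value))  # "user" -> 1 token, "hi" -> 1 token
num_tokens += 3                                    # every reply is primed with <|start|>assistant<|message|>

print(num_tokens)  # 8, matching the prompt_tokens billed for a bare "hi"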


Thank you, everyone, for responding. Having the user’s text wrapped in added system decoration makes sense, and OpenAI seems to have documented in the Python notebook how it increases the token count. I expected the same overhead on the return tokens, but that doesn’t seem to be the case.

BTW: I did add the additional decorations and I still couldn’t make it come out to 8 tokens 🙁

I went down this rabbit hole a while back and wrote up a blog post about it, and published a JavaScript library to help estimate the token counts.

Yeah, it would be great if OpenAI were to comment so the mystery of tokens can be resolved. Keeping customers in limbo but charging them is not what I would expect.

The current version of ChatML, and the special tokens trained for different circumstances, are not published. However, when you account for the correct special tokens (which cannot be joined with adjacent text) and the inserted unseen text, your count will come out right.


<|fim_prefix|>system<|fim_middle|>You are not ChatGPT<|fim_suffix|><|fim_prefix|>user:name_field<|fim_middle|>Bake a cake<|fim_suffix|><|fim_prefix|>assistant<|fim_middle|>


Overhead: 7 for first message, 4 per additional
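Working that out for the original single-message example: 7 tokens of overhead for the first message plus 1 token for “hi” gives the 8 prompt_tokens reported in Postman, and each additional message would add 4 more overhead tokens on top of its own content.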


Cool! I am curious: what does <|endofprompt|> do?