Hello, does anyone know why the token IDs provided by the OpenAI Tokenizer tool (OpenAI Platform) are “wrong” (or at least different) compared to those provided by this tool: Tiktoken Web Interface, cl100k_base?

For example, I would like my bot not to say “Ah” at the beginning of a sentence. It works if I set logit_bias like this: {"25797": -100}, but if I use the token ID for “Ah” provided by the OpenAI tool (10910), it doesn't work.

Thank you if you have an explanation.


That's because the OpenAI tokenizer website uses an older tokenizer. It's not current, so its token IDs don't match cl100k_base, and you shouldn't use it unless you're using a GPT-3 model.
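If you want to check the IDs yourself instead of relying on a web page, here is a minimal sketch using the `tiktoken` Python package. The helper names `token_ids` and `build_logit_bias` are made up for illustration, and I'm assuming the older web tool corresponds to an encoding such as `r50k_base`; treat that as a guess, not a confirmed detail.

```python
def token_ids(text, encoding_name):
    """Return the token IDs for `text` under the named tiktoken encoding."""
    # Imported lazily so the pure-Python helper below works without tiktoken.
    import tiktoken  # third-party: pip install tiktoken

    encoding = tiktoken.get_encoding(encoding_name)
    return encoding.encode(text)


def build_logit_bias(ids, bias=-100):
    """Build a logit_bias mapping: token ID (as a string key) -> bias value."""
    return {str(token_id): bias for token_id in ids}
```

Calling `token_ids("Ah", "cl100k_base")` and `token_ids("Ah", "r50k_base")` side by side should show whether the two encodings disagree, and `build_logit_bias(token_ids("Ah", "cl100k_base"))` gives a dict you can pass as `logit_bias`. Keep in mind that “Ah” and “ Ah” (with a leading space) are usually different tokens, so you may need to bias both.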

Oh OK, thank you so much!