What's the new tokenization algorithm for gpt-4o?

After the release of gpt-4o, I found that it uses a new tokenization algorithm. So what’s the new tokenization algorithm for gpt-4o?


The encoding name of the tokenizer corresponding to gpt-4o seems to be “o200k_base”.


Thanks.

I’ve found the encoding file: https://openaipublic.blob.core.windows.net/encodings/o200k_base.tiktoken
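In case it helps anyone who wants to look inside that file: as far as I can tell, each line of a `.tiktoken` file is a base64-encoded token, a space, and the token's integer rank. Below is a minimal sketch of a parser under that assumption. The function name `parse_tiktoken_ranks` and the three sample lines are made up for illustration; they are not copied from the real file.

```python
import base64


def parse_tiktoken_ranks(text: str) -> dict[bytes, int]:
    """Parse .tiktoken-style text into a {token_bytes: rank} mapping.

    Assumes each non-empty line looks like "<base64 token> <rank>".
    """
    ranks: dict[bytes, int] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        token_b64, rank = line.split()
        ranks[base64.b64decode(token_b64)] = int(rank)
    return ranks


# Illustrative sample only (hypothetical ranks, not taken from o200k_base).
SAMPLE = "IQ== 0\nIg== 1\naGVsbG8= 2\n"

ranks = parse_tiktoken_ranks(SAMPLE)
for token, rank in ranks.items():
    print(rank, token)
```

To run it on the real file, download the URL above and pass the decoded text to the function. If you only need to encode or decode text, `tiktoken.get_encoding("o200k_base")` from the tiktoken package loads the encoding for you.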
