Counting tokens for chat API calls (gpt-3.5-turbo)

There's a great resource in the OpenAI Cookbook on GitHub if you haven't found it yet…

This one explains why the token count is a bit different with ChatGPT… and what you can do…

Counting tokens for chat API calls

ChatGPT models like gpt-3.5-turbo use tokens in the same way as other models, but because of their message-based formatting, it’s more difficult to count how many tokens will be used by a conversation.

Below is an example function for counting tokens for messages passed to gpt-3.5-turbo-0301.

The exact way that messages are converted into tokens may change from model to model. So when future model versions are released, the answers returned by this function may be only approximate. The ChatML documentation explains how messages are converted into tokens by the OpenAI API, and may be useful for writing your own function.
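For reference, the counting logic from that notebook looks roughly like the sketch below. It uses a stand-in whitespace tokenizer instead of tiktoken so the example is self-contained, which means the per-string counts are only illustrative; the overheads, though, follow the gpt-3.5-turbo-0301 notes: 4 tokens per message, minus 1 when a "name" replaces the role, plus 2 to prime the assistant's reply.

```python
# Sketch of the Cookbook-style token counter for gpt-3.5-turbo-0301.
# In practice you would tokenize with tiktoken's "cl100k_base" encoding;
# the whitespace-split stand-in below keeps this runnable without
# dependencies, so the absolute counts are approximate.

def count_tokens(text: str) -> int:
    """Stand-in tokenizer: one token per whitespace-separated word."""
    return len(text.split())

def num_tokens_from_messages(messages: list[dict]) -> int:
    """Approximate tokens used by a gpt-3.5-turbo-0301 chat request."""
    num_tokens = 0
    for message in messages:
        # every message follows <im_start>{role/name}\n{content}<im_end>\n
        num_tokens += 4
        for key, value in message.items():
            num_tokens += count_tokens(value)
            if key == "name":
                # if a name is present, the role is omitted
                num_tokens -= 1  # role is always required and always 1 token
    # every reply is primed with <im_start>assistant
    num_tokens += 2
    return num_tokens

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello there"},
]
print(num_tokens_from_messages(messages))
```

Swapping `count_tokens` for a real tiktoken encoding gives the exact numbers the notebook reports; the message-overhead arithmetic stays the same either way.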

Learn more (including code) at the Source


Yeeeeees, I was so looking for this. Thanks a lot @PaulBellow.
It's a little discouraging to see that the exact token calculation depends on the model, though.

Another relevant aspect of this notebook is that it seems to clarify the role of the "name" key in the messages structure. Kind of: it says that if a "name" is present, "role" is omitted from the tokenization… :thinking:
