Hey @logankilpatrick. I'm trying to pre-compute the exact number of tokens in my prompt before sending a request to the new Chat endpoint, using tiktoken. I'm following the guidelines you provide here to format the prompt from the list of messages, but the number of prompt tokens reported in `completion.usage.prompt_tokens` is always significantly lower than the one I get when formatting the prompt as in the link. For instance:
```python
messages = [
    {'role': 'system',
     'content': 'You are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.'},
    {'role': 'user', 'content': 'Hello world!'},
    {'role': 'assistant', 'content': 'Hello there!'},
    {'role': 'system', 'content': 'Now, you are Elon Musk. Speak like him.'},
    {'role': 'user', 'content': 'Hello world!'},
]
```
would be formatted as:
```
<|im_start|>system
You are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.<|im_end|>
<|im_start|>user
Hello world!<|im_end|>
<|im_start|>assistant
Hello there!<|im_end|>
<|im_start|>system
Now, you are Elon Musk. Speak like him.<|im_end|>
<|im_start|>user
Hello world!<|im_end|>
<|im_start|>assistant
```
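Concretely, I'm counting the tokens of that formatted string roughly like this (a minimal sketch of what I'm doing; I'm assuming gpt-3.5-turbo uses the `cl100k_base` encoding):

```python
import tiktoken

# Assumption: gpt-3.5-turbo uses the cl100k_base encoding.
enc = tiktoken.get_encoding("cl100k_base")

# Join the messages into the ChatML-style string shown above.
prompt = "".join(
    f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
) + "<|im_start|>assistant"

# As far as I can tell, cl100k_base does not register <|im_start|>/<|im_end|>
# as special tokens, so they get split into several ordinary tokens here.
print(len(enc.encode(prompt)))
```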
According to tiktoken, this prompt has 129 tokens, but my API call says the prompt has 70 tokens. If I do not include the special tokens `<|im_start|>` and `<|im_end|>` (see the second snippet below), I get closer, but still not an exact match: 61 tokens. Is there any way we can pre-compute the exact number of tokens in our prompt before sending the actual request?
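To be explicit about that second count, by "not including the special tokens" I mean counting only the role and content text, along these lines (again just a sketch; the exact handling of newlines may differ):

```python
# Count only the role and content text, without the <|im_start|>/<|im_end|>
# markers. This is the variant that gets me close to, but not exactly,
# the prompt_tokens value reported by the API.
num_tokens = sum(
    len(enc.encode(m['role'])) + len(enc.encode(m['content']))
    for m in messages
)
print(num_tokens)
```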
Thanks a lot!!