Tokenizer and playground show a mismatch between the token count and the billed tokens for text-davinci-003

Here is a small update:

So far, the gap holds at 11 tokens.

I tried several different prompts and, every time, there are 11 extra tokens whose origin I can’t figure out.

Nope, there is no “boundary marker”, and the other theories don’t hold either.

The playground uses the wrong tokenizer for the model. Consider:

"text-davinci-003": "p50k_base",
"text-davinci-002": "p50k_base",
"text-davinci-001": "r50k_base",

Now paste a whole bunch of varied text and switch the playground between text-davinci-001 and text-davinci-003. You will see that the token count doesn’t change, even though the two models use different encodings.
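The mapping above can be turned into a small lookup to show why the two models should not give the same count. This is only a minimal sketch: the three exact entries are copied from the mapping quoted above, the `gpt-3.5-turbo` prefix entry is an added assumption (chat models using `cl100k_base`), and `encoding_name_for` and `same_tokenizer` are hypothetical helper names, not tiktoken functions (tiktoken's own lookup is `tiktoken.encoding_for_model`).

```python
# Exact model names, copied from the mapping quoted above.
MODEL_TO_ENCODING = {
    "text-davinci-003": "p50k_base",
    "text-davinci-002": "p50k_base",
    "text-davinci-001": "r50k_base",
}

# Assumption: dated chat-model snapshots share the cl100k_base encoding.
MODEL_PREFIX_TO_ENCODING = {
    "gpt-3.5-turbo": "cl100k_base",
}


def encoding_name_for(model: str) -> str:
    """Return the encoding name for a model, or raise KeyError if unknown."""
    if model in MODEL_TO_ENCODING:
        return MODEL_TO_ENCODING[model]
    for prefix, encoding in MODEL_PREFIX_TO_ENCODING.items():
        if model.startswith(prefix):
            return encoding
    raise KeyError(f"no known encoding for model {model!r}")


def same_tokenizer(model_a: str, model_b: str) -> bool:
    """True when both models map to the same encoding name."""
    return encoding_name_for(model_a) == encoding_name_for(model_b)


print(same_tokenizer("text-davinci-001", "text-davinci-003"))  # prints False
```

A counter that reports identical numbers for both models is therefore using one encoding for both, which is wrong for at least one of them.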

Bad count using the wrong BPE: [image]

Correct count: [image: tokenizer-1]

Switch to text-davinci-001 in tiktokenizer and you get the playground’s mistaken token count.

Worry not about that which you cannot control: others’ code. Record your own token usage from what the API returns and compare that to billing:

--response--
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "text": ""
    }
  ],
  "created": 1688568966,
  "id": "cmpl-xxx",
  "model": "text-babbage-001",
  "object": "text_completion",
  "usage": {
    "prompt_tokens": 1293,
    "total_tokens": 1293
  }
}
---

--response--
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "text": "\n\nThis code provides a function that returns the encoding used by a given model name. It uses a dictionary to map model names to their corresponding encoding, and also checks for model names that match a known prefix. If the model name is not found, an error is raised."
    }
  ],
  "created": 1688568986,
  "id": "cmpl-xxx",
  "model": "text-davinci-003",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 56,
    "prompt_tokens": 1107,
    "total_tokens": 1163
  }
}
---
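One way to keep such a record is to store the `usage` block of every response next to your own local estimate. The sketch below assumes the response has already been parsed into a dict shaped like the two above; `usage_gap` and `local_prompt_count` are hypothetical names, and a missing `completion_tokens` key is treated as zero, as in the first response.

```python
def usage_gap(response: dict, local_prompt_count: int) -> dict:
    """Compare a locally estimated prompt token count with API-reported usage."""
    usage = response["usage"]
    prompt = usage["prompt_tokens"]
    # The first response above has no completion_tokens key; treat that as 0.
    completion = usage.get("completion_tokens", 0)
    return {
        "model": response["model"],
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": usage["total_tokens"],
        # Positive gap: the API counted more prompt tokens than the local tokenizer.
        "prompt_gap": prompt - local_prompt_count,
    }


# Usage figures from the text-davinci-003 response above.
reported = {
    "model": "text-davinci-003",
    "usage": {"completion_tokens": 56, "prompt_tokens": 1107, "total_tokens": 1163},
}
# The local count of 1096 is a made-up value for illustration only.
print(usage_gap(reported, local_prompt_count=1096)["prompt_gap"])  # prints 11
```

Logging this per request makes it easy to tell a constant offset from a gap that grows with prompt length.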

Hello, I am getting the wrong count using gpt-3.5-turbo-0613 as the model.

I am using com.theokanning.openai-gpt3-java to consume the API and com.didalgo:gpt3-tokenizer to count tokens locally.

So should I use davinci-001 instead of gpt-3.5-turbo? Are there any consequences?
Right now I am just adding defensive code to avoid requesting more tokens than the model accepts and, because of that, more defensive code to avoid requesting a negative max_tokens value.

Thanks for the guidance.

GitHub - didalgolab/gpt3-tokenizer-java: Java implementation of a GPT3/4 tokenizer, “loosely ported from tiktoken”.

It was updated five days ago, and appears to support the chat model’s 100k tokenizer:

GPT3Tokenizer tokenizer = new GPT3Tokenizer(Encoding.CL100K_BASE);
List<Integer> tokens = tokenizer.encode("example text here");

where you must use the correct encoding for the model selected. You can verify token counts against a trusted tokenizer tool and against the usage reported in actual responses. And a reminder: a client-side tokenizer needs a large dictionary file.

text-davinci-001 is a completion engine and is not strongly trained on instruction following. It requires a different prompting style, gives different output, and costs much more. So there are few reasons to use davinci, such as when you don’t want to be blasted with disclaimers and warnings.

You probably want even smarter max_tokens code or conversation management. Consider max_tokens a reservation for the output you require: you don’t want just 5 tokens remaining for an answer when you’ve dumped 4000+ input tokens.
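The reservation idea can be sketched as a small budget check. This is not the API's own logic, just one possible approach: `plan_max_tokens`, `context_window`, `desired_reply` and `min_reply` are hypothetical names, and the 4096 used in the example is only an illustrative context size that you should replace with the limit of your model.

```python
def plan_max_tokens(prompt_tokens: int, context_window: int,
                    desired_reply: int, min_reply: int) -> int:
    """Return a max_tokens value, or raise if the prompt leaves too little room.

    The reply budget is whatever the context window has left after the prompt,
    capped at desired_reply. If less than min_reply remains, the caller should
    shorten the prompt (for example, drop old conversation turns) instead of
    sending a request that can only produce a truncated answer.
    """
    remaining = context_window - prompt_tokens
    if remaining < min_reply:
        raise ValueError(
            f"only {remaining} tokens left for the reply; need at least {min_reply}"
        )
    return min(desired_reply, remaining)


# Plenty of room: the full desired reply fits.
print(plan_max_tokens(1107, 4096, desired_reply=500, min_reply=200))  # prints 500
# Tight but acceptable: the reply is capped at what is left.
print(plan_max_tokens(3800, 4096, desired_reply=500, min_reply=200))  # prints 296
```

Because the function raises before the remainder can drop below `min_reply`, it also rules out the negative max_tokens case without a separate guard.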