Feature request: Query token counts via API

We’re using Codex for program synthesis research at MIT. In our setting, we are generating many different prompts on-the-fly and want to fill the prompt window with as many examples of our task as possible. However, we’ve found that this process is a bit painful and are hoping to get some assistance from the OpenAI team.

Currently, estimating how many tokens are in the prompt involves a lot of guesswork. Our process looks something like:
(1) Make an initial guess of the number of characters that could fit in the prompt, based on an approximate tokens-to-chars ratio that we measured empirically.
(2) Query the OpenAI API for a completion. If the request returns an InvalidRequestError, manually parse the error text to extract the number of tokens in the invalid prompt.
(3) Based on the delta between the estimated and actual number of tokens, reduce the size of the prompt and repeat the query.
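
For readers who want to see the loop concretely, here is a minimal sketch of steps (1) to (3). It is not our exact code: `PromptTooLongError` is a hypothetical stand-in for the client's InvalidRequestError, `send` is a hypothetical callable that issues the request, and the error wording matched by the regex ("N in your prompt") is an assumption about the message format.

```python
import re
from typing import Callable, Optional


class PromptTooLongError(Exception):
    """Hypothetical stand-in for the API client's InvalidRequestError."""


def parse_prompt_tokens(error_text: str) -> Optional[int]:
    # Assumed wording: "... (2040 in your prompt; 20 for the completion)".
    # The real message format may differ, so treat this pattern as a guess.
    match = re.search(r"(\d+) in your prompt", error_text)
    return int(match.group(1)) if match else None


def fit_by_retry(
    text: str,
    max_prompt_tokens: int,
    chars_per_token: float,
    send: Callable[[str], None],
    max_tries: int = 10,
) -> str:
    # Step 1: initial character budget from the empirical tokens-to-chars ratio.
    budget = int(max_prompt_tokens * chars_per_token)
    for _ in range(max_tries):
        prompt = text[:budget]
        try:
            # Step 2: issue the request; it raises if the prompt is too long.
            send(prompt)
            return prompt
        except PromptTooLongError as err:
            actual = parse_prompt_tokens(str(err))
            if actual is None:
                raise  # unknown error format, nothing to correct against
            # Step 3: shrink the budget in proportion to the overshoot.
            budget = int(len(prompt) * max_prompt_tokens / actual)
    raise RuntimeError("could not fit the prompt within max_tries attempts")
```

Every correction costs one failed request, which is the overhead we would like to avoid.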

@raf I read your post about the Tokenizer tool, which does exactly what we need, but is limited to the web browser. Is there some way to query this tool programmatically? If not, would it be possible to expose an API call to this endpoint? If we could know how many tokens are in our prompts ahead of time, then we could avoid making repeated API calls and save a lot of headache overall.
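
To illustrate what knowing the count ahead of time would buy us, here is a rough sketch of packing examples into the prompt with no API calls at all. `pack_examples` and `count_tokens` are hypothetical names; `count_tokens` stands for whatever local counter becomes available, and the sketch assumes its counts match what the API would report for the model.

```python
from typing import Callable, List


def pack_examples(
    examples: List[str],
    count_tokens: Callable[[str], int],
    max_prompt_tokens: int,
    separator: str = "\n\n",
) -> str:
    """Greedily keep examples, in order, while the joined prompt fits."""
    chosen: List[str] = []
    for example in examples:
        candidate = separator.join(chosen + [example])
        # Count the whole candidate rather than summing per-example counts,
        # since tokens can merge across the separator boundary.
        if count_tokens(candidate) > max_prompt_tokens:
            break
        chosen.append(example)
    return separator.join(chosen)
```

For example, with a toy counter that splits on whitespace, `pack_examples(["a b c", "d e", "f g h i"], lambda s: len(s.split()), 5)` keeps the first two examples and drops the third.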

I also chatted with @jforte about this topic ~6 weeks ago, and he mentioned that he had put in an internal feature request. I'm not sure where that stands.


Hi @gg :wave:

Welcome to the community.

Here are some tools that should help:

  1. node.js - gpt-3-encoder - npm
  2. python - Tokenizer

Still looking for a good solution here. As noted by OpenAI (OpenAI API), the Codex tokenizer uses a more efficient whitespace encoding, so token counts differ between GPT-3 and Codex. Is there a version of the Hugging Face GPT2TokenizerFast, or some setting, that replicates this behavior? Are there differences between the GPT-2 and GPT-3 tokenizers?

Hi @gg

Here’s something: