Firstly, the article appears to be using an outdated tokenizer in its examples. Specifically, it states that the word “time” tokenizes to the ID 2435 and the word " time" (with a leading space) tokenizes to the ID 640. However, these token IDs correspond to the legacy GPT-3 tokenizer. The newer tokenizer used for GPT-3.5 and GPT-4 assigns different token IDs: 1712 for “time” and 892 for " time". Updating these examples to reflect the current tokenizer would reduce confusion for users.
Secondly, it would be beneficial to clarify whether logit bias can be applied to multiple token IDs for a single word. For instance, the word “brave” is composed of two tokens, “br” and “ave”, with token IDs [1347, 525], yet the logit_bias parameter maps each individual token ID to an integer between -100 and 100. Explicitly stating how to handle such cases would be helpful for users attempting to apply logit bias to multi-token words.
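To make that concrete, here is a minimal sketch (the `build_logit_bias` helper is mine, and the token IDs come from the “brave” example above; verify them against the current tokenizer before relying on them) of building a logit_bias dictionary that covers every token of a multi-token word, clamping the bias into the accepted range:

```python
def build_logit_bias(token_ids, bias):
    """Map each token ID of a word to the same bias value, clamped to
    the API's accepted range of -100..100. Keys are strings, since
    logit_bias is a JSON object keyed by token ID."""
    clamped = max(-100, min(100, bias))
    return {str(tid): clamped for tid in token_ids}

# "brave" tokenizes to "br" + "ave" -> [1347, 525] per the example above.
bias = build_logit_bias([1347, 525], -100)
print(bias)  # {'1347': -100, '525': -100}
```

The resulting dict can be passed as the `logit_bias` parameter of a completion request, applying the same bias to every piece of the word.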
The wrong token IDs are incidental: looking up tokens is tedious regardless, and few people would actually want to modify the word “time”. The more important point to impress on the reader is that there is no longer a need to pick between three different tokenizers; all current OpenAI models use cl100k_base.
The logit bias does take negative values as well: the input ranges from -100 (the token is all but prohibited) to +100 (the AI can write nothing else).
You do point out a practical limitation, though: besides the fact that a word has many written forms that all must be discovered, there are composite words where blocking the first token would significantly damage other vocabulary that shares it.
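A toy illustration of that collateral damage (the “bravado” split below is hypothetical, chosen only to show the mechanism, not real tokenizer output):

```python
# Hypothetical token splits -- for illustrating the mechanism only.
tokenizations = {
    "brave":   [1347, 525],    # "br" + "ave" (from the example above)
    "bravado": [1347, 44403],  # hypothetical: shares the leading "br" token
}
blocked = {1347}  # biasing the shared first token to -100

# Every word whose tokenization contains a blocked token is affected.
affected = sorted(w for w, ids in tokenizations.items()
                  if blocked & set(ids))
print(affected)  # ['bravado', 'brave']
```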
One minor addition to the discussion: I’ve recently implemented negative logit bias as part of an application that generates summaries and their associated headings, to get rid of GPT’s “favourites” such as “bolster”, “unveil”, etc. I too struggled a bit with how to apply it to multi-token words, but just gave it a try and included both token IDs. A few weeks in, I have seen a solid improvement and a clear reduction in the use of these words, without the negative impact on the rest of the wording that one might expect.
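One possible shape for this, sketched below: merge the token IDs of every unwanted word into a single logit_bias map with a mild negative value rather than a hard -100 ban (the token IDs here are placeholders, not real lookups -- find the actual IDs with the cl100k_base tokenizer first):

```python
# Placeholder token IDs -- look the real ones up with cl100k_base first.
FAVOURITES = {
    "bolster": [1, 2],  # placeholder IDs for a two-token word
    "unveil":  [3, 4],  # placeholder IDs
}

# A mild negative bias discourages the words without hard-banning
# tokens that other vocabulary may share.
logit_bias = {str(tid): -15
              for ids in FAVOURITES.values()
              for tid in ids}
print(logit_bias)  # {'1': -15, '2': -15, '3': -15, '4': -15}
```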