Strange behavior with high logit_bias

      const chatCompletion = await this.openAIApi.createChatCompletion({
        model: 'gpt-3.5-turbo',
        temperature: 0,
        messages,
        logit_bias: {
          27000: 50, // "microwave" according to the doc
        },
      });

A simple “Hello there!” prompt would make the API hang for several dozen seconds, and then answer with something like this:

reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected reflected …

(and more “reflected” repeated ad nauseam, probably filling whatever tokens remained up to the limit).

You might have already figured this out, but token 27000 does in fact correspond to ‘ reflected’. I’m not sure which docs you’re referring to, but they might be using a different encoder than the one used for gpt-3.5-turbo. And it’s giving that wall of text because you’re essentially telling the model that each time it determines which token comes next, it should weight ‘ reflected’ very heavily, which is why it keeps repeating itself.
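To see why a bias of 50 all but forces the token, here is a minimal sketch of what the bias does mathematically: the value is added to the token’s logit before the softmax, so even a token the model considered very unlikely ends up with nearly all the probability mass. The logit values below are made up for illustration.

    import math

    def softmax(logits):
        # Numerically stable softmax: subtract the max before exponentiating.
        m = max(logits.values())
        exps = {tok: math.exp(v - m) for tok, v in logits.items()}
        total = sum(exps.values())
        return {tok: e / total for tok, e in exps.items()}

    # Hypothetical next-token logits at one decoding step.
    logits = {"Hello": 5.0, "Hi": 4.2, " reflected": -2.0}

    # logit_bias adds its value directly to that token's logit.
    biased = dict(logits)
    biased[" reflected"] += 50

    p_before = softmax(logits)[" reflected"]   # tiny
    p_after = softmax(biased)[" reflected"]    # essentially 1.0

With the bias applied, ‘ reflected’ wins the sampling step virtually every time, at every step, regardless of temperature — hence the wall of text.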


The issue, I think, behind the original assumption that 27000 was “microwave” is that the current Tokenizer on the OpenAI site uses an older encoding, not the cl100k_base encoding that gpt-3.5-turbo uses. I built a quick cl100k_base one to remedy this: Tiktoken Web Interface cl100k_base (no logs, no data retention, runs in RAM for the duration of the session only).
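You can also check token IDs locally with the tiktoken library, using the cl100k_base encoding that gpt-3.5-turbo uses — a quick sketch, assuming tiktoken is installed (`pip install tiktoken`):

    import tiktoken

    # cl100k_base is the encoding used by gpt-3.5-turbo (and gpt-4).
    enc = tiktoken.get_encoding("cl100k_base")

    # Decode the ID from the original logit_bias to see what it really is.
    print(repr(enc.decode([27000])))

    # And go the other way: find the IDs that actually encode "microwave".
    print(enc.encode(" microwave"))

Running the round trip both ways makes it obvious when a token ID copied from docs (or an older tokenizer tool) doesn’t mean what you think it means under the model’s actual encoding.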