gpt-3.5-turbo *does not* respect logit_bias

Here is a clean demonstration, based on the “once upon a time” example from the Cookbook.

here we see the token for “time”:
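(For anyone following along at home, a quick sketch with tiktoken; the token IDs in the comments are the ones quoted later in this thread, assuming gpt-3.5-turbo’s cl100k_base encoding:)

```python
import tiktoken

# gpt-3.5-turbo uses the cl100k_base encoding
enc = tiktoken.get_encoding("cl100k_base")

print(enc.encode("time"))   # [1712] -- "time" with no leading space
print(enc.encode(" time"))  # [892]  -- " time" with a leading space is a different token
```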

now here are two cases:

without bias (as expected)

with bias (wtf)
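If you want to reproduce the two cases yourself, here’s a minimal sketch using the current openai Python client (the exact prompt wording and max_tokens are my guesses, not necessarily what the Cookbook uses):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

messages = [{"role": "user", "content": "Complete this sentence: Once upon a"}]

# Case 1: no bias
no_bias = client.chat.completions.create(
    model="gpt-3.5-turbo", messages=messages, temperature=0.1, max_tokens=8,
)

# Case 2: demote token 1712 ("time") as hard as the API allows
with_bias = client.chat.completions.create(
    model="gpt-3.5-turbo", messages=messages, temperature=0.1, max_tokens=8,
    logit_bias={1712: -100},  # -100 is documented to effectively ban a token
)

print("without bias:", no_bias.choices[0].message.content)
print("with bias:   ", with_bias.choices[0].message.content)
```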

I wonder if there are any canned responses then :thinking:

The AI just figured out how to start the response it wanted with “ time” (892) instead of “time” (1712) by repeating a letter first, because the more commonly used token wasn’t demoted.

The AI is quite deterministic at temperature 0.1, as in the example screenshot, so getting the same response, or “rolling” the same dice result, is completely expected, even when diverted by one unusual token.

Good point, @kennys1495. Have you tried with a temperature of 0.5?

The contents are not really the concern; it is that there are multiple token representations of “time”, such as “Time”, “ Time”, and “_time”, any of which could be used or substituted by a clever AI, just as it will still find a way to apologize if you take away all its canned words for doing so.
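One way to chase down those variants is to bias every single-token spelling at once. Rough sketch; the variant list here is just illustrative, not exhaustive:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# Surface forms of "time" we want demoted (illustrative list)
variants = ["time", " time", "Time", " Time", "_time", "TIME", " TIME"]

bias = {}
for v in variants:
    ids = enc.encode(v)
    if len(ids) == 1:      # logit_bias only takes single token IDs
        bias[ids[0]] = -100

print(bias)  # pass this dict as logit_bias in the API call
```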

While “Once upon a” leads it to a specific destination, three more injected tokens give it a turn of phrase that was completely unexpected 150 years ago:
poem

and we again see “a” repeated:
1. because it’s been programmed to chat, not to be a pure completion engine;
2. because completion is broken by the special tokens of the ChatML container;
3. because it doesn’t understand what you want.

Ah, I see! Same situation with being a helpful “assistant” too, and there's no way to pull it away from that “assist” token (or subtokenized versions of it).

Temperature doesn’t help either (even at t=1 & top_p=1)
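For what it’s worth, here’s the sort of test I ran (hedged: the “assist” variant list is my guess at the relevant subtokens):

```python
import tiktoken
from openai import OpenAI

enc = tiktoken.get_encoding("cl100k_base")
client = OpenAI()

# Demote every single-token spelling of "assist" (illustrative list)
bias = {}
for v in ["assist", " assist", "Assist", " Assist"]:
    ids = enc.encode(v)
    if len(ids) == 1:
        bias[ids[0]] = -100

r = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Who are you?"}],
    temperature=1,
    top_p=1,
    logit_bias=bias,
)
print(r.choices[0].message.content)  # still finds a way to "assist"
```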

so yeah, now I see better how hard the RLHF stuff stuck to GPT :rofl: