Can We Detect When Logit Bias Affects Output?

Is there a way to verify whether logit_bias actually influenced the model’s output?

I’m trying to assess the impact of setting logit_bias in a request. For example:

  • Without any logit_bias, I get this result: "Whispers of the Engine Guru"
  • With a logit_bias applied to block "Engine" and " Engine" tokens, the result changes to: "Whispers of the Wrench Wizard"

This suggests the bias is working, but I’m wondering—
Is there any way to explicitly tell whether a logit_bias entry caused the model to choose a different token during generation?
Essentially, can we track whether any logit_bias items were actively avoided in producing the final output?

Here are my requests:

Without logit bias set:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini-2024-07-18",
    "temperature": 0,
    "seed": 42,
    "messages": [
      {
        "role": "system",
        "content": "Write a five word title for the story: In a small town garage, Luis, a quiet Honda mechanic, had a gift—he could diagnose any engine by sound alone. Locals said he spoke fluent Civic and Accord."
      }
    ]
  }'

With logit bias set on "Engine" and " Engine":

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini-2024-07-18",
    "temperature": 0,
    "seed": 42,
    "logit_bias": {"7286": -100, "11032": -100},
    "messages": [
      {
        "role": "system",
        "content": "Write a five word title for the story: In a small town garage, Luis, a quiet Honda mechanic, had a gift—he could diagnose any engine by sound alone. Locals said he spoke fluent Civic and Accord."
      }
    ]
  }'
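
For reference, the token IDs used in the logit_bias dictionary can be derived with something like the following sketch, assuming the tiktoken library and that gpt-4o-mini uses the o200k_base encoding; the printed IDs should line up with the keys in the request above.

import tiktoken

enc = tiktoken.get_encoding("o200k_base")  # encoding assumed for the gpt-4o family

# Check every capitalization/spacing variant you care about, since each
# one maps to a different token ID.
for variant in ["Engine", " Engine", "engine", " engine"]:
    print(repr(variant), enc.encode(variant))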

Unfortunately, the most obvious method for seeing the effects of logit_bias, comparing the returned logprobs against a known baseline path, has been disabled.

Logprobs now report a filtered view of the logit dictionary, captured before any influence from bias or sampling parameters. This second mechanism, which delivers only a facsimile of what is actually going on, was introduced to discourage exploration of the models' layers and parameters.
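
For what it's worth, this is roughly the comparison you would want to make, sketched with the official openai Python package (the helper name is mine): request logprobs with and without the bias and diff the reported top alternatives at the first generated position. Per the above, don't expect the reported values to actually reflect the bias.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = ("Write a five word title for the story: In a small town garage, "
          "Luis, a quiet Honda mechanic, had a gift—he could diagnose any "
          "engine by sound alone. Locals said he spoke fluent Civic and Accord.")

def first_position_top_logprobs(logit_bias=None):
    # Ask for logprobs plus the top alternatives at each position.
    resp = client.chat.completions.create(
        model="gpt-4o-mini-2024-07-18",
        temperature=0,
        seed=42,
        logprobs=True,
        top_logprobs=10,
        logit_bias=logit_bias or {},
        messages=[{"role": "system", "content": PROMPT}],
    )
    first = resp.choices[0].logprobs.content[0]
    return {alt.token: alt.logprob for alt in first.top_logprobs}

print(first_position_top_logprobs())                               # no bias
print(first_position_top_logprobs({"7286": -100, "11032": -100}))  # bias applied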

Further, OpenAI also blocks the logprob value of the generated token if it falls below the top 20, reporting it as -9999 instead. The rollout of this was bugged, breaking logprobs for a long time.
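
If you want to spot that sentinel in a response, a minimal helper, taking the -9999 value described above at face value, might look like this:

def masked_positions(logprob_content):
    # logprob_content is resp.choices[0].logprobs.content from a chat
    # completion requested with logprobs=True; return the tokens whose
    # reported logprob was clamped to the -9999 sentinel.
    return [item.token for item in logprob_content if item.logprob <= -9999]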

Then there is the fact that non-default sampling parameters seem to simply disable logit bias in the vast majority of cases. I haven't fully cataloged when this happens, as it is likely one more "wontfix": simply more obfuscation with intention behind it.

So you really just have the generated output to go by, and its sampled picks from near-certainties.

You can use your example of -100 preventing a particular token from appearing (although there are usually several fallback tokens that can look similar), or +100 sending the AI into a loop of producing nothing else. That can help you confirm you actually targeted the right token number and covered all the bases, given the multiple ways the same thing can be written (for example, closing a string and ending a JSON object along with some newlines can all be a single token, among countless other options).
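
A sketch of that +100 check with the openai Python package (the prompt here is just illustrative): if the token IDs are right, a short completion should collapse into the boosted token repeated.

from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="gpt-4o-mini-2024-07-18",
    temperature=0,
    max_tokens=10,                             # cap the loop you are inducing
    logit_bias={"7286": 100, "11032": 100},    # +100 instead of -100
    messages=[{"role": "system",
               "content": "Write a five word title for a story about a quiet Honda mechanic."}],
)
# If the IDs are correct, the output should degenerate into the boosted token over and over.
print(resp.choices[0].message.content)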

There's really no "may have caused" signal to be found. You might have shifted a logit's share of the mass from 40% to 10%, with the remainder then filling in the renormalized space. Seed doesn't even guarantee identical sampling from that, given the unavoidable non-determinism of the models.
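
A toy renormalization with made-up logits shows the point: a moderate negative bias knocks the top candidate from roughly 45% down to roughly 10%, and the remaining candidates absorb the freed probability mass.

import numpy as np

logits = np.array([2.0, 1.5, 1.0, 0.5])       # made-up logits for four candidate tokens

p = np.exp(logits) / np.exp(logits).sum()
print(p.round(3))                             # [0.455 0.276 0.167 0.102]

biased = logits + np.array([-2.0, 0.0, 0.0, 0.0])   # mild logit_bias on the top candidate
p_biased = np.exp(biased) / np.exp(biased).sum()
print(p_biased.round(3))                      # [0.102 0.455 0.276 0.167], the rest renormalize upward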


Helpful insight - thank you!