Logprobs for specific tokens, not just top tokens

I am wondering if there’s a way to extract the logprobs for specific tokens, not just the top tokens. Using the text classification example from the logprobs cookbook, for instance, I’d like to extract the logprobs of specific tokens like “Arts”, “Sports”, etc.

Something of the form:

for headline in headlines:
    print(f"\nHeadline: {headline}")
    API_RESPONSE = get_completion(
        [{"role": "user", "content": CLASSIFICATION_PROMPT.format(headline=headline)}],
        model="gpt-4o-mini",
        logprobs=True,
        TOKENS=["Art", "Sports", ...]
    )

Welcome to the community!

My understanding is that they’re actively trying to prohibit that so their models don’t get “stolen”. It might work if you suppress all the tokens you don’t want to see with logit bias (https://platform.openai.com/docs/api-reference/chat/create#chat-create-logit_bias), but there’s probably a limit on that.

That said, you can also limit the token probabilities by simply restricting what the model can output. Simply telling the model, or enforcing a schema limited to your categories, might get you somewhere near that, but you’d still see tokens like “S”, “Sp”, “Spo”, etc. being more likely than “Arts” in a sports category.
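
Something along these lines could be a starting point for the schema approach (just a sketch; the model name and category list are placeholders, and it assumes the structured-output response_format is available for the model you use):

from openai import OpenAI

client = OpenAI()

# Placeholder categories for illustration
categories = ["Arts", "Sports", "Business", "Technology"]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Classify this headline: Local team wins the championship"},
    ],
    # Restrict the output to a JSON object whose "value" must be one of the categories
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "classification",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {"value": {"type": "string", "enum": categories}},
                "required": ["value"],
                "additionalProperties": False,
            },
        },
    },
)
print(response.choices[0].message.content)  # e.g. {"value":"Sports"}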

However:

Could it be that you’re actually more interested in embeddings?

If you use the embedding models, you can embed your query and your categories, and then compute the dot product between the two to get a similar result. (https://platform.openai.com/docs/api-reference/embeddings)
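
Roughly like this (a sketch; the model name, categories, and headline are placeholders):

from openai import OpenAI
import numpy as np

client = OpenAI()

categories = ["Arts", "Sports", "Business", "Technology"]
headline = "Local team wins the championship after dramatic overtime"

# Embed the categories and the query in a single call
result = client.embeddings.create(
    model="text-embedding-3-small",
    input=categories + [headline],
)
vectors = [np.array(item.embedding) for item in result.data]
category_vectors, query_vector = vectors[:-1], vectors[-1]

# OpenAI embeddings are unit-length, so the dot product is the cosine similarity
scores = {cat: float(vec @ query_vector) for cat, vec in zip(categories, category_vectors)}
print(max(scores, key=scores.get), scores)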


Thanks for the quick response! The task I want to do specifically requires access to token probabilities for specific tokens, but I agree that in general the embeddings could be used for a similar task (e.g. using cosine similarity between queries and categories as you mentioned).

It sounds as though the API doesn’t natively support restricting to specific tokens, though!


That is the specific “attack” that has been disabled: logit_bias does not affect the logprob return. Otherwise, you could iteratively demote up to 1024 logits and see what’s still there.

While verifying whether logit_bias affects different outputs, such as with response_format, I found out over an hour of scripting and trials: logit_bias is NOT WORKING on any model, even in a basic request, to affect the response output at all (not just logprobs). Time to make a new WTF post.

You are provided a “fake” logprob that is not actually the one used for sampling. For example, you might see a 99% probability space in the top-20 that you can receive; however, the true probability, such as a “function-call” special token having a 30% chance of being emitted as the first token, is not disclosed. Nor is the chance of an “end-of-output” token mid-document, such as at the end of a paragraph, where it becomes more likely.

Top-20 is still good if you do a good job of choosing an enum that doesn’t produce a lot of similar alternates.
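
Something like this sketch can then pull out the probabilities of just the tokens you care about from the returned top_logprobs (the categories and model are placeholders, and it assumes each category starts with a distinct first token):

import math
from openai import OpenAI

client = OpenAI()

categories = ["Arts", "Sports", "Business", "Technology"]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Classify this headline in one word: Local team wins the championship"}],
    logprobs=True,
    top_logprobs=20,  # 20 is the maximum the API returns per position
    max_tokens=1,
)

# Probabilities of your category tokens at the first output position,
# if they appear among the returned top 20
first_position = response.choices[0].logprobs.content[0]
category_probs = {
    entry.token: math.exp(entry.logprob)
    for entry in first_position.top_logprobs
    if entry.token in categories
}
print(category_probs)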


Thank you for the response. I have a question regarding this:

You are provided a “fake” logprob that is not actually the one used for sampling. For example, you might see a 99% probability space in the top-20 that you can receive; however, the true probability, such as a “function-call” special token having a 30% chance of being emitted as the first token, is not disclosed.

If I request, say, the top 5 tokens, are the relative probabilities still correct? That is, even if the probabilities are not actually the true probabilities that are being sampled from, are the relative probabilities among the top 5 tokens still correct?

Yes, the total “mass” of the logit distribution is renormalized over the available tokens that you will get a report on.

When you get all 20 that are available, they’ll probably sum to near 100% probability if the AI has any inkling of what you want it to produce.
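
You can check this yourself by converting the returned logprobs to probabilities and summing them, something like this (assuming a chat.completions response requested with logprobs=True and top_logprobs=20, as in the earlier sketch):

import math

# `response` is a chat.completions response requested with logprobs=True, top_logprobs=20
top = response.choices[0].logprobs.content[0].top_logprobs
probs = {entry.token: math.exp(entry.logprob) for entry in top}
for token, p in probs.items():
    print(f"{token!r}: {p:.4f}")
print("probability mass covered by the top 20:", sum(probs.values()))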


For example, here is a run of Python code I’ve been writing from scratch that prods the AI with a classification task and its possible enums:

Initial output of script reporting on its parameters:

Warning: enum string “thesis” encoded to more than one token,
using “th” for logit bias instead.
Warning: enum string “whitepaper” encoded to more than one token,
using “white” for logit bias instead.

You are a classifier, outputting the topic of a provided document in JSON with a single key "value".

enum "value" must be chosen from only:
['report', 'article', 'blog', 'thesis', 'whitepaper', 'newsletter', 'manual', 'guide', 'review', 'paper']

# examples of every permitted response JSON

{"value":"report"}
{"value":"article"}
{"value":"blog"}
{"value":"thesis"}
{"value":"whitepaper"}
{"value":"newsletter"}
{"value":"manual"}
{"value":"guide"}
{"value":"review"}
{"value":"paper"}

Just pick any random type, there's no document


Bias: {22869: 2, 12608: -5, 13318: -100, 404: 2, 9988: 2, 172777: 0, 43480: 20, 51283: 0, 37404: -1, 23112: 5}

Response Report

output content:{"value":"article"}

Logprobs at the value position, converted to probability

Logprobs:{
  "token": "article",
  "logprob": 0.6773386835778668,
  "top_logprobs": [
    {
      "token": "article",
      "logprob": 0.6773386835778668
    },
    {
      "token": "blog",
      "logprob": 0.3199521581969428
    },
    {
      "token": "report",
      "logprob": 0.0019025046958103897
    },
    {
      "token": "guide",
      "logprob": 0.00022722178296200756
    },
    {
      "token": "white",
      "logprob": 0.0001769605025016923
    },
    {
      "token": "paper",
      "logprob": 0.0001769605025016923
    }
  ]
}

You can see that 67.7% + 32% for just the first two certainties is already 99.7%, even when I told the AI to just pick a random one of the enums (here not structured with response_format, just over-prompted).

Neat stuff

The system prompt generation, the logit biases, the schema, all originate from one object, and the token numbers are obtained from tiktoken encoding:

developer_enum_bias = {
    "report": 2,
    "article": -5,
    "blog": -100,
    "thesis": 2,
    "whitepaper": 2,
    "newsletter": 0,
    "manual": 20,
    "guide": 0,
    "review": -1,
    "paper": 5,
}
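
Something like this sketch can do the string-to-token-ID conversion, falling back to the first token when an enum string doesn’t encode to a single token (matching the warnings above); the encoding name is assumed for a gpt-4o-class model:

import tiktoken

# o200k_base is the encoding used by gpt-4o-class models
enc = tiktoken.get_encoding("o200k_base")

logit_bias = {}
for word, bias in developer_enum_bias.items():
    token_ids = enc.encode(word)
    if len(token_ids) > 1:
        fallback = enc.decode([token_ids[0]])
        print(f'Warning: enum string "{word}" encoded to more than one token,\n'
              f'using "{fallback}" for logit bias instead.')
    logit_bias[token_ids[0]] = bias

# logit_bias can then be passed to chat.completions.create(..., logit_bias=logit_bias)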

Such a mechanism could counteract the AI over-producing one category – if it worked.

Proof that logit_bias is currently not working at all: a user input of “no document; just output the blog classification” gets you a blog result, despite the -100 logit_bias against producing “blog” above.
