Logprobs and message.content are inconsistent

Let’s talk about the output of a language model in overall terms, in the order of operations performed to generate the next (n+1) token from the current input context.

Here’s a convoluted abstraction I just typed up, drawn from lots of probing and from some deterministic trials that can no longer be meaningfully run on current models. Some of these interjections into the process would need more flowchart branches than I depict.

    flowchart

        context_&_AI_pretraining --> embeddings --> Language_inference
        context_&_AI_pretraining --> hidden_state --> Language_inference
        context_&_AI_pretraining --> json_mode --> Language_inference
        context_&_AI_pretraining --> run_supervision... --> Language_inference
        Language_inference -- logits_dictionary --> logit_bias -- logits_dictionary --> softmax_production

        softmax_production --> top_p -- truncated_mass --> softmax_production
        softmax_production -- logprobs --> temperature -- dictionary --> multinomial_sampler
        multinomial_sampler -- token --> content_filter
        run_supervision... --> content_filter -- "API filtering and containerization" --> output
        Language_inference -- logits_dictionary --> softmax_alt -- bias_ignored --> logprobs_return -- "tokenstobytes" --> output

I don’t depict that the token is added to the context, and the generation repeats until interruption.
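
Here’s a minimal Python sketch of that order of operations, following the chart above. The logits dictionary, the function name, and the example values are all made up for illustration; this is not the actual serving code.

    import math
    import random

    def sample_next_token(logits, logit_bias=None, top_p=1.0, temperature=1.0):
        """Toy pipeline: logit_bias -> softmax -> top_p -> temperature -> multinomial draw."""
        # 1. logit_bias is applied to the raw logits before any softmax
        biased = {t: v + (logit_bias or {}).get(t, 0.0) for t, v in logits.items()}

        # 2. softmax over the biased logits
        peak = max(biased.values())
        exps = {t: math.exp(v - peak) for t, v in biased.items()}
        total = sum(exps.values())
        probs = {t: e / total for t, e in exps.items()}

        # 3. top_p: keep the smallest set of top tokens whose mass reaches top_p, renormalize
        kept, mass = {}, 0.0
        for t, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
            kept[t] = p
            mass += p
            if mass >= top_p:
                break
        probs = {t: p / mass for t, p in kept.items()}

        # 4. temperature rescales the log-probabilities (a value of 1 leaves them unchanged)
        scaled = {t: math.log(p) / temperature for t, p in probs.items()}
        total = sum(math.exp(v) for v in scaled.values())
        probs = {t: math.exp(v) / total for t, v in scaled.items()}

        # 5. the multinomial "token lottery"
        return random.choices(list(probs), weights=list(probs.values()), k=1)[0]

    print(sample_next_token({"0": 1.5, "1": 0.2, " zero": -1.0}))

With the defaults of 1 for both top_p and temperature, steps 3 and 4 change nothing, so the draw is made straight from the softmax probabilities.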


Each token is randomly selected: it is sampled.

Without alteration by top_p or temperature (using the default value of 1 for each), the AI’s certainty is translated directly into the chance that the token will be randomly picked as the output. It’s like a token lottery.

“Hello” might be a 75%-certain response. If the input is English, “こんにちは” might be 7.2e-8 certain, perhaps showing up in roughly one in fourteen million generations, but it is still under consideration.
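
The logprob values the API returns are just the natural logs of these probabilities, so the conversion is a one-liner (the numbers are the ones quoted above):

    import math

    p_hello = 0.75           # the 75%-certain token
    p_konnichiwa = 7.2e-8    # the long-shot token

    print("logprob of 'Hello':", round(math.log(p_hello), 3))           # about -0.288
    print("logprob of 'こんにちは':", round(math.log(p_konnichiwa), 3))  # about -16.447
    print(f"expected frequency: about 1 in {1 / p_konnichiwa:,.0f} generations")  # ~1 in 13.9 million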

In your case, you have a “0” token at 79%. Run a million trials without altering the sampling parameters, and about 79% of them will produce “0”.
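
You can convince yourself of that with a quick simulation; here’s a sketch with a made-up two-token distribution standing in for your model’s real one:

    import random
    from collections import Counter

    # stand-in for the model's next-token distribution: "0" at 79%, everything else lumped together
    tokens, weights = ["0", "other"], [0.79, 0.21]

    trials = Counter(random.choices(tokens, weights=weights, k=1_000_000))
    print(trials["0"] / 1_000_000)   # ~0.79, give or take sampling noise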

Without restraining the output, the AI’s decision-making on the input you supplied is more akin to a biased coin flip than a deterministic answer.
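
If you want to see the lottery entrants alongside the winner, you can request logprobs on the same call and compare them to message.content. A sketch with the OpenAI Python SDK, assuming current response shapes (the model and prompt are placeholders for your own):

    import math
    from openai import OpenAI

    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",                                                # placeholder model
        messages=[{"role": "user", "content": "Reply with 0 or 1 only."}],  # placeholder prompt
        max_tokens=1,
        logprobs=True,
        top_logprobs=5,
    )

    choice = resp.choices[0]
    print("sampled content:", choice.message.content)
    for alt in choice.logprobs.content[0].top_logprobs:
        # exp(logprob) turns the returned log-probability back into a percentage
        print(f"{alt.token!r}: {math.exp(alt.logprob):.2%}")

The point of the comparison: the top entry in top_logprobs is not always the token that appears in message.content, because the sampler is free to pick any of them.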
