Questions regarding API sampling parameters (temperature, top_p)

Top-p narrows the selectable tokens: at 1.00 the full 100% probability space is available, while at 0.20 only the most probable tokens whose cumulative probability reaches 20% (inclusively) remain. In this scheme it operates on the raw probabilities and is not affected by temperature.

For example, a top_p of 0.5 turns probabilities like:
“banana” : 30%
“apple”: 20%
“orange”: 15%
(all others): 35%

into a renormalized distribution over the surviving tokens:
“banana” : 60%
“apple”: 40%
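
To make the arithmetic concrete, here is a minimal Python sketch of that cutoff, assuming (as described above) it runs on the raw probabilities before temperature. The tokens and numbers are the toy ones from the example, with top_p = 0.5 so that two tokens survive:

```python
def top_p_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the most probable tokens until their cumulative probability
    reaches top_p, then renormalize the survivors to sum to 1."""
    kept, cumulative = {}, 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:  # inclusion stops once the total reaches top_p
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}

# The toy numbers from above; the remaining 35% belongs to many small
# tokens omitted here for brevity.
probs = {"banana": 0.30, "apple": 0.20, "orange": 0.15}
print(top_p_filter(probs, 0.5))  # {'banana': 0.6, 'apple': 0.4}
```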

Temperature then acts on this reduced set. Lowering the temperature, for example, widens the distance between the choices:
“banana” : 75%
“apple”: 25%

(Temperature works on the base-e logarithmic probabilities, dividing them before they are converted back into probabilities, so the exact magnitude of its effect is not intuitive.)
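
A sketch of that rescaling with the same toy numbers: divide the natural-log probabilities by the temperature, then re-apply softmax. A temperature around 0.37 (an illustrative value, worked backwards from the 75/25 result) turns 60/40 into roughly 75/25:

```python
import math

def apply_temperature(probs: dict[str, float], temperature: float) -> dict[str, float]:
    # Divide each log-probability by the temperature, then softmax back.
    scaled = {t: math.log(p) / temperature for t, p in probs.items()}
    z = sum(math.exp(s) for s in scaled.values())
    return {t: math.exp(s) / z for t, s in scaled.items()}

nucleus = {"banana": 0.6, "apple": 0.4}
print(apply_temperature(nucleus, 0.37))  # ~ {'banana': 0.75, 'apple': 0.25}
```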

Then the selection process samples from the resulting distribution by weight, choosing “apple” about 25% of the time.
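
Continuing the sketch, that final pick is just a weighted random draw:

```python
import random

final = {"banana": 0.75, "apple": 0.25}
token = random.choices(list(final), weights=final.values(), k=1)[0]
print(token)  # "apple" roughly 25% of the time
```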

Yes, top_p can reduce a 100,000-token vocabulary to 30 candidates even at a setting of 0.99, because the cutoff works by adding up token probabilities from most to least likely until the running total meets or exceeds top_p, at which point inclusion stops. When writing structured English, the probabilities of alternates drop off quickly, so the surviving set is often small.
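
An illustration with made-up numbers: when the model is confident, a handful of candidates already cover 99% of the probability mass, so even top_p = 0.99 keeps only a few tokens:

```python
probs = [0.90, 0.06, 0.02, 0.015] + [0.005 / 1000] * 1000  # 1004 toy "tokens"

cumulative, kept = 0.0, 0
for p in sorted(probs, reverse=True):  # most likely first
    kept += 1
    cumulative += p
    if cumulative >= 0.99:  # top_p reached: inclusion stops
        break
print(kept)  # 4 of 1004 tokens survive top_p = 0.99
```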

top_p = 0 is the most reliable setting for near-deterministic output. Temperature is floating-point math on probabilities in which a temperature of exactly 0 is undefined (it would divide the logits by zero), so the API must substitute a different method, such as a greedy argmax, anyway. With top_p = 0 the cutoff is met immediately, since the first token (and any token after it) never has a probability below 0, so only the single most likely token can survive.
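
As an illustration with the OpenAI Python client (the model name and prompt are placeholders, not from this post, and the accepted top_p range is worth checking against the current API docs), near-deterministic settings might look like:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Say hello."}],
    top_p=0,  # the nucleus collapses to the single most likely token
)
print(resp.choices[0].message.content)
```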
