Temperature and top_p interactions?

Hello,

OpenAI “generally” recommends altering “temperature” or “top_p”, but not both.

This suggests that one can alter “temperature”, but then should leave “top_p” on its default value, and vice versa.

I would like to understand why these two parameters should NOT be altered at the same time, and what happens if they are.

Thank you.

Martin

They can be altered separately. You just have to understand what they do.

top_p is applied first. It limits the set of tokens that can be sampled, by cumulative probability mass: top_p = 0.90 keeps the smallest set of top tokens whose probabilities sum to at least 90% (inclusive). For the following unambiguous case, only the top-1 token could be selected:

[image: token probability chart for an unambiguous next-token case]

Then temperature is applied as a divisor of the logits: a value smaller than 1.0 widens the gap between the top token and the rest, making the most likely tokens even more likely to be sampled, while a value above 1.0 flattens the distribution.
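Here is a quick sketch of that divisor mechanic, with toy logits of my own rather than real model output:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide each logit by the temperature before the softmax.
    temperature < 1.0 sharpens the distribution; > 1.0 flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# hypothetical logits for three candidate tokens
logits = [2.0, 1.0, 0.0]
print(softmax_with_temperature(logits, 1.0))  # baseline distribution
print(softmax_with_temperature(logits, 0.5))  # sharper: top token dominates more
print(softmax_with_temperature(logits, 2.0))  # flatter: choices closer to equal
```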

They can be used together. For example, here’s an ambiguous word choice: a top_p of 0.15 would limit sampling to just the top 5 tokens shown. A high temperature could then make the choices nearly equal in probability instead of “people” dominating the results.

[image: token probability chart for an ambiguous word choice, with “people” on top]
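The combined effect can be sketched by chaining the two steps. These probabilities are invented to mimic the scenario, not taken from the actual chart:

```python
import math

def apply_top_p(probs, top_p):
    """Keep the smallest set of highest-probability tokens whose
    cumulative mass reaches top_p, then renormalize."""
    order = sorted(probs, key=probs.get, reverse=True)
    kept, cumulative = [], 0.0
    for tok in order:
        kept.append(tok)
        cumulative += probs[tok]
        if cumulative >= top_p:
            break
    total = sum(probs[t] for t in kept)
    return {t: probs[t] / total for t in kept}

def apply_temperature(probs, temperature):
    """Flatten or sharpen a distribution by dividing its log-probs
    by the temperature and renormalizing."""
    logs = {t: math.log(p) / temperature for t, p in probs.items()}
    m = max(logs.values())
    exps = {t: math.exp(v - m) for t, v in logs.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

# hypothetical ambiguous word choice, "people" leading a long tail
probs = {"people": 0.05, "folks": 0.04, "users": 0.03,
         "customers": 0.02, "developers": 0.015, "penguins": 0.0001}
nucleus = apply_top_p(probs, 0.15)      # cuts the tail, keeps the top 5
flat = apply_temperature(nucleus, 5.0)  # high temperature: near-equal odds
```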

Top-p is good for cutting off the unpredictable tail of tokens that would serve to confuse and break formats.


Would you agree with this simpler formulation?

Temperature is a scaling factor and pushes probabilities up across the board, and top_p controls your probability cutoff.

If you crank up the temperature to raise the token probabilities across the board by a certain amount, but adjust top_p proportionally so that the same tokens get sampled as before, it’s the same as if you didn’t do anything at all.

temperature 0, top_p 1 should have the same output as temperature 2, top_p 0.


Setting either parameter to exactly 0 is mathematically invalid, so OpenAI substitutes a small value as a placeholder when you specify 0. That placeholder is not as small as a value you can enter yourself, such as 1e-9, and the difference can be evidenced in sampling surveys.

They are not the same and cannot be compared.

temperature: 0.01 still lets lottery-winning-odds tokens through; the results are just heavily weighted toward the top.
top_p: 0.01 allows only tokens within the top 1% of probability mass through, which leaves just one possible token even in some of the most unpredictable circumstances:

[image: token probability chart where the top token alone exceeds 1% of the mass]
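That asymmetry can be sketched with toy numbers of my own, not real model output:

```python
import math

# toy logits: a fairly confident top token plus two alternatives
logits = [5.0, 2.0, 1.0]

# temperature 0.01: every token keeps a nonzero (if astronomically small)
# probability; the tail is weighted down, never removed
scaled = [l / 0.01 for l in logits]
m = max(scaled)
exps = [math.exp(s - m) for s in scaled]
probs_t = [e / sum(exps) for e in exps]
print(probs_t)  # top token near 1.0, the others tiny but still nonzero

# top_p 0.01: the top token alone already exceeds 1% of the mass,
# so it is the only candidate left; the rest are hard-excluded
probs = [0.94, 0.05, 0.01]  # hypothetical softmax output
kept, cumulative = [], 0.0
for p in sorted(probs, reverse=True):
    kept.append(p)
    cumulative += p
    if cumulative >= 0.01:
        break
print(kept)  # [0.94], a single surviving token
```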
