Hello,
What is the top_p default value or is it disabled by default?
Thank you in anticipation.
Regards,
Martin
Hi Martin,
As per the documentation, it defaults to 1.
In the API Reference docs, go to Chat > Create chat completion:
https://platform.openai.com/docs/api-reference/chat/create
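For reference, here's a minimal sketch with the official openai Python SDK (the model name and prompt are just placeholders); since the default is 1, passing top_p=1 explicitly is the same as omitting it:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",   # placeholder model name
    messages=[{"role": "user", "content": "Say hello."}],
    top_p=1,                 # the documented default; same as leaving it out
)
print(response.choices[0].message.content)
```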
Cheers
Thank you.
The default of 1 confuses me, considering the range is (or seems to be) 0 to 1.
Regards,
Martin
Diet · February 6, 2024, 6:43am · #4
It’s the top 0% to 100%, in decimal form: top probabilities up to 0.0 (one item), or up to 1.0 (all items)*. What did you expect it to be?
* this is not 100% accurate; it’s technically \sum_{x_i \in V^{(p)}} P(x_i | x_{1:i-1}) \ge p
formula ripped from here: Nucleus Sampling
edit: * it is actually accurate, see What is the top_p default value? - #8 by Diet
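If it helps to see the formula in action, here is a small NumPy sketch of that top-p (nucleus) filtering step, with made-up token probabilities; it keeps the smallest set of tokens whose cumulative probability reaches p, then renormalizes:

```python
import numpy as np

def top_p_filter(probs: np.ndarray, p: float) -> np.ndarray:
    order = np.argsort(probs)[::-1]                    # tokens, most to least likely
    cumulative = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cumulative, p)) + 1   # smallest set with mass >= p
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()                   # renormalize the kept mass

probs = np.array([0.5, 0.3, 0.15, 0.05])   # made-up probabilities
print(top_p_filter(probs, 0.8))            # keeps the top two (0.5 + 0.3 >= 0.8)
print(top_p_filter(probs, 0.0))            # keeps only the single top token
```

Note that with p = 0.0 exactly one token survives, and with p = 1.0 the whole vocabulary does, matching the "one item" / "all items" endpoints above.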
I thought the top_p range would have been the same as temperature (0 to 2).
Diet · February 6, 2024, 6:58am · #8
I looked through the paper, https://arxiv.org/pdf/1904.09751.pdf, and 0 is actually allowed.
If we take a closer look at the formula, it’s the smallest set whose cumulative probability is greater than or equal to p.
For the same reason, I am wondering why the temperature’s range is not 0 to 1.
_j · February 6, 2024, 7:33am · #10
Seems like the kind of thing your AI can answer…
Documentation: `Temperature` parameter in Sampling Algorithm

Overview

The `Temperature` parameter, or `temp` in an API, is frequently used in the softmax function of the sampling algorithm during text generation, controlling the randomness or diversity of the generated text. It is part of the mathematical background of logits or log-probs, which essentially translates the concept of ‘confidence’ into numerical form.
Sampling Algorithm

In AI systems, especially text generation, the sampling algorithm is crucial for generating responses. To determine the next word in a sequence, the model calculates a score (or log-probability) for each potential word. These scores are then transformed into probabilities by exponentiating and normalizing them, i.e., the softmax function.
The model then samples from these probabilities to select the next word. A high score translates into a high probability, indicating the model’s confident belief that this is the appropriate word to select next.
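As a minimal sketch of that score-to-probability-to-sample pipeline (the logits here are illustrative, not from a real model):

```python
import numpy as np

rng = np.random.default_rng(0)

logits = np.array([3.2, 2.1, 0.4, -1.0])   # illustrative scores for a 4-token vocabulary
probs = np.exp(logits - logits.max())      # exponentiate (max subtracted for stability)
probs /= probs.sum()                       # normalize: together, this is the softmax
next_token = rng.choice(len(probs), p=probs)   # sample the next token by probability
print(next_token, probs.round(3))
```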
Temperature and Softmax

The `temperature` plays a significant role in the process of converting scores into probabilities. Technically, it is a scaling factor applied to the log-probs (or logits) prior to the softmax.
By default, the `temperature` is 1.0, preserving the original confidence scores as determined by the model. However, it can be adjusted to tune the output.
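Concretely, the scaling step looks like this; a minimal sketch of the generic softmax-with-temperature formula, not OpenAI’s internal implementation:

```python
import numpy as np

def softmax_with_temperature(logits: np.ndarray, temp: float) -> np.ndarray:
    scaled = logits / temp        # the temperature divides the logits
    scaled -= scaled.max()        # subtract the max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()        # normalize into a probability distribution

logits = np.array([2.0, 1.0, 0.5])            # illustrative logits
print(softmax_with_temperature(logits, 1.0))  # temp 1.0 keeps the original confidences
```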
Effects of Temperature Value

Temperature > 1: As the temperature increases (e.g., 2.0), the scores are essentially downscaled, which acts to flatten the probability distribution. As a result, the gap between the largest score and the other scores shrinks, resulting in less distinction. This brings more diversity and randomness to the text generation, allowing less confident suggestions more chance to surface.

Temperature < 1: On the contrary, as the temperature decreases (e.g., 0.5), scores are upscaled. This widens the gap between the high and low scores, and the model becomes more confident in its predictions. This means less diverse output, but the outputs are more likely to follow the model’s initial, most probable predictions and maintain coherence.

Note: While 0 < `temperature` <= 1 tends to generate more ‘human-like’, sensible responses, be aware that extremely low temperature values could make the model over-confident, which could lead to repetitiveness.
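To make the two cases concrete, a self-contained NumPy sketch with illustrative logits (not from any real model), showing the sharpening and flattening described above:

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.5])   # illustrative logits for three tokens
for temp in (0.5, 1.0, 2.0):
    scaled = logits / temp - (logits / temp).max()   # temperature scaling
    probs = np.exp(scaled) / np.exp(scaled).sum()    # softmax
    print(temp, probs.round(3))

# 0.5 -> [0.844 0.114 0.042]  (sharper: the top token dominates)
# 1.0 -> [0.629 0.231 0.14 ]  (the model's original confidences)
# 2.0 -> [0.481 0.292 0.227]  (flatter: the gaps shrink)
```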
Conclusion

The `Temperature` parameter is a powerful tool allowing flexible control over the diversity-versus-quality trade-off in text generation. By understanding its operation, developers can apply this feature to get the most appropriate, diverse, or creative responses, depending on their specific use case.
Diet · February 6, 2024, 7:34am · #11
That’s a good question!
If it’s based on softmax, it would look something like this:
[Interactive GeoGebra graph: softmax probabilities with an adjustable temperature slider]
as you can see, as you move the temperature up, the distribution gets flatter and flatter, and the probabilities of individual tokens become more or less the same. beyond 2, there’s not much more going on; most of the excitement happens between 0 and 2. why the cap is exactly 2? not sure.
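A quick numeric version of that flattening, sketched in NumPy with made-up logits (not tied to any real model), shows how little changes past 2:

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.5])   # made-up logits for three tokens
for temp in (1.0, 2.0, 4.0, 8.0):
    scaled = logits / temp - (logits / temp).max()
    probs = np.exp(scaled) / np.exp(scaled).sum()
    print(temp, probs.round(3))

# The distribution creeps toward uniform (1/3 each), but most of the
# movement happens between temp 1 and 2.
```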