Temperature, top_p and top_k for chatbot responses

Helveticus · July 11, 2023, 10:22pm

Hello

I’m using GPT as a chatbot. I have successfully fine-tuned the model on conversation data. For inference I’m now using temperature = 1, top_p = 0.6, and top_k = 35. In the following link it is written that for chatbot responses it is best to use temperature = 0.5 and top_p = 0.5. On the other hand, I have also read elsewhere that temperature = 1 or top_p = 1 should hold.

What values for temperature, top_p and top_k are best to use for chatbot responses? The chatbot should stick to the learned knowledge from the conversation data (and not hallucinate facts) but should also not produce repetitive responses (be somewhat creative).

_j · July 11, 2023, 11:43pm

Today we try out Claude v2:

Nucleus sampling is a technique used in large language models to control the randomness and diversity of generated text. It works by sampling from only the most likely tokens in the model’s predicted distribution.

The key parameters are:

Temperature: Controls randomness, higher values increase diversity.

Top-p (nucleus): The cumulative probability cutoff for token selection. Lower values mean sampling from a smaller, more top-weighted nucleus.

Top-k: Sample from the k most likely next tokens at each step. Lower k focuses on higher probability tokens.

In general:

Higher temperature will make outputs more random and diverse.

Lower top-p values reduce diversity and focus on more probable tokens.

Lower top-k also concentrates sampling on the highest probability tokens for each step.

So temperature increases variety, while top-p and top-k reduce variety and focus samples on the model’s top predictions. You have to balance diversity and relevance when tuning these parameters for different applications.

OpenAI recommends only altering either temperature or top-p from the default.

Top-k is not exposed.

Nucleus sampling parameters alone cannot stop an AI from hallucinating, but they can keep the output on a path of low perplexity. When the temperature is set high, alternate token choices can be made that are not a good fit:

The cause of most astronaut deaths in one word?
Acc = 87.53%
M = 5.81%
Expl = 2.98%
F = 0.60%
Mis = 0.34%

You can see that one mis-step can send the conversation on a whole new course.

Try temperature 0.4 unless you really want unexpected writing. The person chatting about computer code will appreciate it.

8troy · July 12, 2023, 6:17am

Great write-up!

I’ll need to try some sample prompts with a few different settings.
Any thoughts on setting both temperature and top-p to non-default values (despite recommendations)?

Some notes from my testing, mainly writing code for a specific task, using:

Chat Completions
gpt-3.5-turbo

Even with the prompt significantly massaged, some instructions in the System Prompt are ignored.

Increasing the temperature to 1.5 almost always gets me the expected behavior - although repeated calls are much less reliable, and the overall cohesion of the answer is compromised.

It’s possible my prompts just need to be streamlined further - get more specific, a little more verbose?

Several calls with the same Prompts usually gets me enough good code to get around any mistakes.

This is not ideal, and I will be trying the configs you have mentioned.

_j · July 12, 2023, 7:06am

softmax temperature can be though of as the amount of noise injected into the decision-making process

top-p can be considered a weighting that pushes more towards selecting top results

The current models don’t need the temperature increased to be “creative”, they already produce poorer tokens than before. Increasing will only help to break deterministic output for you on repeated runs.

Helveticus · July 12, 2023, 11:30am

Thank you very much for your answer. So you recommend temperature = 0.4 and top_p = 1?

I’m also using a local Huggingface model (GPT-J) where I can set top_k. What value would you recommend for top_k in that case? The default value is 50 for top_k.

_j · July 12, 2023, 12:13pm

The top-k is how many tokens from the highest ranking ones are to be considered; others below that are excluded. Setting it to 1 and you are almost guaranteed the predicted choice that temperature can’t affect. The quality of tokens goes down quickly after the first few, you might get some extra carriage returns, comma to continue a sentence instead of a period, more hyphens as a line break, different ways to start producing a list of 10 fun facts. And it depends on if they have near equal weighting or instead a clear answer.

How many more possibilities do you need for “a yellow fruit” than a banana that will always be chosen? A lower number might have an infinitesimal improvement on performance. A number equivalent to the whole token dictionary size doesn’t really matter if the temperature is under control. Good to let the AI company choose the optimum.

Helveticus · July 12, 2023, 5:23pm

Thank you very much for your explanation. So what is OpenAI using as top_k value by default?

_j · July 13, 2023, 8:09pm

Unknown. The top-k is big enough that a ridiculously-high temperature gives ridiculously-unlikely tokens.

sanjay.singh · August 18, 2023, 10:32am

Hi can you share if temperature and Top K parameters be set only using ChatGPT API or is there any way through which I can change these parameters while using ChatGPT on Open AIs website?

_j · August 18, 2023, 6:05pm

There are two separate things: ChatGPT, and the API for accessing AI models via software.

ChatGPT, the website, doesn’t let us know the settings they use and there is no hack way of discovering it (although one could do statistical analysis over hundreds of the same session).

The setting of ChatGPT is likely similar to recommendations seen in various documentation: temperature=0.7, top-p=1.0 (default, no limitation). The top-k is likely unset, meaning all 100k tokens are considered.

Topic		Replies	Views
What is the best temperature and top_p for a chatbot? API gpt-35-turbo , prompt	4	5198	December 21, 2023
Does temperature go to 1 or 2? API	6	26764	January 12, 2024
Cheat Sheet: Mastering Temperature and Top_p in ChatGPT API API	36	274472	January 29, 2024
A better explanation of "Top P"? Prompting	10	127621	December 12, 2023
Ask about GPT4 temperature? API gpt-4	1	3392	September 4, 2023

Temperature, top_p and top_k for chatbot responses

Related topics