It took a good amount of futzing around, and the prompt is just a happenstance leftover from other things I was trying, but I have an interesting result.
If you really want to have fun with statistics, do trials on an output where the top two tokens' logprobs match to 8 digits!
"top_logprobs": [
{
" Aug": -2.4173014,
" Oct": -2.4173014,
" Mar": -2.440739,
" Jan": -2.440739
}
]
- Aug = 8.92%
- Oct = 8.92%
- Jan = 8.71%
- Mar = 8.71%
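(Those percentages are just exp() of the reported logprobs; a quick check in Python:)

import math

for tok, lp in {" Aug": -2.4173014, " Oct": -2.4173014,
                " Mar": -2.440739, " Jan": -2.440739}.items():
    print(tok, f"{math.exp(lp):.2%}")
# prints:  Aug 8.92%,  Oct 8.92%,  Mar 8.71%,  Jan 8.71%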
model: davinci-002
max_tokens: 1
"prompt": """In the square brackets are 1000 random ASCII characters, using 0-9a-zA-Z: [0-9a-zA-Z]{1000}.
share|improve this answer
edited"""
Let’s run 70 trials at multiple settings. Extract the first letter each time.
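A sketch of how that trial loop might look (it reuses client and prompt from the snippet above; the first_letters helper is just my own wrapper):

def first_letters(top_p, temperature, n_trials=70):
    # One single-token completion per trial; keep the first letter of the
    # sampled token (e.g. "O" for " Oct", "A" for " Aug").
    out = []
    for _ in range(n_trials):
        resp = client.completions.create(
            model="davinci-002",
            prompt=prompt,
            max_tokens=1,
            temperature=temperature,
            top_p=top_p,
        )
        out.append(resp.choices[0].text.strip()[:1])
    return "".join(out)

print(first_letters(top_p=0.0891, temperature=2))
print(first_letters(top_p=0.0892, temperature=2))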
"top_p": 0.0891, temperature=2
OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
"top_p": 0.0892, temperature=2
OOOAAOAAAOOOAAAAOOAOAAOOAOOAOAOOAOAAOAAOOOOAAAOAAAOAAOOOAAAAOOOOAAOOAO
Thus there is an exact top_p threshold at which the next-ranked token becomes allowed.
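To see why that hairline matters, here's the textbook nucleus-sampling rule (keep the smallest set of tokens, in descending probability order, whose cumulative mass reaches top_p); I'm assuming the API does something equivalent:

import math

def nucleus_size(probs, top_p):
    # Smallest prefix of the probability-sorted tokens whose cumulative
    # mass reaches top_p (at least one token is always kept).
    total = 0.0
    for i, p in enumerate(sorted(probs, reverse=True), start=1):
        total += p
        if total >= top_p:
            return i
    return len(probs)

p_tied = math.exp(-2.4173014)   # ~0.08916, each of the two tied top tokens
print(nucleus_size([p_tied, p_tied, 0.0871, 0.0871], top_p=0.0891))  # 1
print(nucleus_size([p_tied, p_tied, 0.0871, 0.0871], top_p=0.0892))  # 2

At 0.0891 the first tied token's 0.08916 alone already reaches the threshold, so only one token survives; at 0.0892 it falls just short, so the second tied token gets in as well.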
Let’s continue:
"top_p": 0.0892, temperature=0.000000001 (very A)
OAAAAAAAAAAAAOAAAAOAAAAAAAAAAAAAAAOAAOAAOAOAAAAAAAAAOOAAAAOOOOAAOAOAAA
"top_p": 0.0892, temperature=0.0000000001 (all A)
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
And you won't believe it: switch the temperature from minuscule to exactly 0, and it changes:
First letter results of "top_p": 0.0892, temperature=0.0
OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
Or even if we release the top_p restriction entirely, it changes yet again:
First letter results of "top_p": 1.0, temperature=0.0
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
The odd thing is that the temperature-limit and top_p-limit methods each converge on a different one of the two allowed tokens, depending on the setting.
Are they literally tied as far as top_p is concerned, so the first one encountered is picked, while temperature is able to put distance between the probabilities?
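If that guess is right, the arithmetic backs it up: dividing by a tiny but nonzero temperature blows up even a last-bit difference between two seemingly tied logits, while an exact tie would survive any temperature. A toy illustration (the 1e-8 gap is purely hypothetical):

import math

def softmax_t(logits, t):
    # Temperature-scaled softmax, with the max subtracted for numerical
    # stability before exponentiating.
    scaled = [x / t for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    return [e / z for e in exps]

# Two logits that print identically at the precision shown above
# but differ in the last few bits.
a, b = -2.4173014, -2.4173014 - 1e-8

print(softmax_t([a, b], 1.0))    # ~[0.5, 0.5]: indistinguishable
print(softmax_t([a, b], 1e-9))   # ~[1.0, 0.0]: the tiny gap dominates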