I wonder if anyone knows why we get different results when running the same prompt multiple times in a row.
I have noticed in quite a lot of my experiments that if you set a cool-down time between runs, the results tend to be consistent again. In all of these runs, I have set the temperature parameter to zero.
Do GPTs have any state? Could running prompts one after another influence each other?
OpenAI models are non-deterministic, meaning that identical inputs can yield different outputs. Setting temperature to 0 will make the outputs mostly deterministic, but a small amount of variability may remain due to GPU floating point math.
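For example, re-running an identical request a few times with temperature=0 (a sketch using the current openai Python client; the model name and prompt are just placeholders) will usually, but not always, return identical text:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

outputs = set()
for _ in range(5):
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model name
        messages=[{"role": "user", "content": "Name three primary colors."}],
        temperature=0,
    )
    outputs.add(resp.choices[0].message.content)

# Usually a single unique output at temperature=0, but occasionally more,
# due to non-deterministic GPU floating point accumulation.
print(len(outputs), outputs)
```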
Recently, we have been noticing “full determinism” in responses given the same prompt with temperature=0. Has something changed?
Welcome to the forum!
It is very deterministic, but you may see some variation over hundreds of runs.
So would using a temperature of 0 be the most accurate?
Interesting question, I’m not sure to be honest. I have often found the deterministic answers to be acceptable, but perhaps not innovative. I think a high temp will produce a better answer some of the time and a worse one at others. Temperature 0 will get you a consistent reply, but potentially not the best one possible.
Hey @boris, do you think anything has changed in terms of the underlying asynchronous floating point operations on the GPU in the past few months that might have increased the determinism of the OpenAI endpoints?
@Foxabilo If temperature=0, does setting top_p to any value (either 0 or 1 or something in between) have any effect?
Technically, temperature is a divisor applied to the logits of the candidate tokens.
So let’s say I have two possibilities for the token the AI might generate, and represent them one-dimensionally:
" the" = 0.3333
" a" = 0.2500
Dividing by temperature 0.5 is multiplying by 2:
" the" = 0.6666
" a" = 0.5000
which increases the distance between them after normalization.
A multinomial distribution function then picks tokens in proportion to their probability. A probability equivalent to one face of a die means you expect that face about 16.66% of the time.
So a temperature of 0.000001 massively favors the top token. Temperature 0 would mean dividing by zero; I don’t know what their code substitutes for that.
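A minimal sketch of that mechanic in Python (the logit values and the tiny divisor guard are my own illustrative assumptions, not OpenAI’s actual code):

```python
import numpy as np

tokens = [" the", " a"]
logits = np.array([0.3333, 0.2500])  # illustrative logits for the two candidates

def sample(logits, temperature, rng=np.random.default_rng(0)):
    scaled = logits / max(temperature, 1e-6)  # temperature divides the logits
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()                      # softmax renormalization
    # Multinomial draw: pick each token in proportion to its probability
    return tokens[rng.choice(len(tokens), p=probs)], probs

print(sample(logits, temperature=1.0))   # close to a coin flip
print(sample(logits, temperature=0.5))   # gap between the two widens
print(sample(logits, temperature=1e-6))  # effectively always the top token
```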
Top-p is a nucleus sampling parameter that can also be passed via the API; it removes low-probability tokens from consideration. A very low value has an effect similar to a very low temperature.
(informed by GPT-2 code)
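And a rough sketch of the nucleus (top_p) filter itself, in the spirit of the GPT-2 sampling code (the probabilities are made up for illustration):

```python
import numpy as np

def top_p_filter(probs, top_p):
    # Keep the smallest set of highest-probability tokens whose cumulative
    # probability reaches top_p; zero out the rest and renormalize.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    keep = order[: np.searchsorted(cumulative, top_p) + 1]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

probs = np.array([0.50, 0.30, 0.15, 0.05])
print(top_p_filter(probs, top_p=0.75))  # top two tokens survive (0.50 + 0.30 >= 0.75)
print(top_p_filter(probs, top_p=0.01))  # a very low top_p keeps just the top token
```

At temperature near 0 the top token wins regardless, so in practice top_p should make no visible difference in that case.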
@boris I understand nothing has changed in the OpenAI software layers. How about cuDNN, CUDA, and the GPU drivers? Could something have changed there that might have increased determinism?