Coping with inconsistent results on identical inputs

The seed parameter is used by the multinomial probability sampler, the only place where it makes sense. Only its use is documented, though, not the technology behind it.

With default temperature and top-p (or top-k), a token with a 50% predicted normalized probability will appear 50% of the time. In the image above, you can see the logit probabilities for a particular token position.
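As a minimal sketch of that relationship (NumPy, with hypothetical logits rather than values from any real model), the snippet below applies a temperature-1 softmax and then draws from the resulting distribution many times; the token holding roughly 50% of the normalized probability mass shows up in roughly 50% of the draws.

```python
import numpy as np

logits = np.array([1.68, 1.0, 0.5, 0.0])   # hypothetical; token 0 ends up near 50%
temperature = 1.0

scaled = logits / temperature
probs = np.exp(scaled - scaled.max())
probs /= probs.sum()                        # softmax -> normalized probabilities

rng = np.random.default_rng()
draws = rng.choice(len(probs), size=100_000, p=probs)   # multinomial sampling

for i, p in enumerate(probs):
    print(f"token {i}: predicted {p:.3f}, observed {(draws == i).mean():.3f}")
```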

The choice between candidate tokens is made using random numbers.

If the random number generator is set to the same state (the seed) each time, the same sampling choice is performed the same way.
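Here is a small sketch of that idea (NumPy again, with a hypothetical probability vector): resetting the generator to the same seed reproduces the exact same sequence of sampling choices, as long as the probabilities fed into it are identical.

```python
import numpy as np

probs = np.array([0.50, 0.48, 0.015, 0.005])   # hypothetical softmax output

def sample_tokens(seed, n=10):
    rng = np.random.default_rng(seed)          # same seed -> same generator state
    return rng.choice(len(probs), size=n, p=probs)

print(sample_tokens(seed=1234))
print(sample_tokens(seed=1234))   # identical to the line above
print(sample_tokens(seed=42))     # a different seed gives a different sequence
```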

However, it is the input certainties that come before this softmax which seem to exhibit non-determinism, observed in OpenAI models since 3.5-turbo and ada embeddings.

When there is ambiguity in token choice, such as in creative writing, the top two candidates may be very close in probability. Just a bit of variance originating from model “flaws” can then cause them to switch positions, or cause other tokens to occupy different portions of the random probability space, regardless of whatever sampling selection method follows.
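The toy example below (hypothetical logits, NumPy) illustrates the point: with two candidates separated by a thousandth of a logit, a perturbation well under 1% flips the greedy pick and changes a small fraction of the seeded samples, even though the sampler itself never varies.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

base = np.array([3.001, 3.000, 1.0, 0.0])     # top two almost tied (hypothetical)
jitter = np.array([-0.002, 0.002, 0.0, 0.0])  # tiny variance from model "flaws"

p_clean = softmax(base)
p_noisy = softmax(base + jitter)

print("greedy pick, clean:", p_clean.argmax())   # token 0
print("greedy pick, noisy:", p_noisy.argmax())   # token 1 -- the order flipped

# Identical seed on both sides: only the shifted probabilities differ.
draws_clean = np.random.default_rng(0).choice(4, size=100_000, p=p_clean)
draws_noisy = np.random.default_rng(0).choice(4, size=100_000, p=p_noisy)
print("samples that changed:", (draws_clean != draws_noisy).sum())
```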