Coping with inconsistent results on identical inputs

The seed parameter is used by the multinomial probability sampler, the only place where it makes sense. Only its use is documented, though, not the technology behind it.

With default temperature and top-p (or top-k), a token with a 50% predicted normalized probability will appear 50% of the time. In the image above, you can see the logit probabilities for a particular token position.
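As a minimal sketch of that relationship (NumPy, with hypothetical logits rather than values from any real model), the snippet below applies a temperature-1 softmax and then draws from the resulting distribution many times; the token holding roughly 50% of the normalized probability mass shows up in roughly 50% of the draws.

```python
import numpy as np

logits = np.array([1.68, 1.0, 0.5, 0.0])   # hypothetical; token 0 ends up near 50%
temperature = 1.0

scaled = logits / temperature
probs = np.exp(scaled - scaled.max())
probs /= probs.sum()                        # softmax -> normalized probabilities

rng = np.random.default_rng()
draws = rng.choice(len(probs), size=100_000, p=probs)   # multinomial sampling

for i, p in enumerate(probs):
    print(f"token {i}: predicted {p:.3f}, observed {(draws == i).mean():.3f}")
```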

The choice between candidate tokens is made using random numbers.

If the random number generator is set to the same state (the seed) each time, the same sampling choice is performed the same way.
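Here is a small sketch of that idea (NumPy again, with a hypothetical probability vector): resetting the generator to the same seed reproduces the exact same sequence of sampling choices, as long as the probabilities fed into it are identical.

```python
import numpy as np

probs = np.array([0.50, 0.48, 0.015, 0.005])   # hypothetical softmax output

def sample_tokens(seed, n=10):
    rng = np.random.default_rng(seed)          # same seed -> same generator state
    return rng.choice(len(probs), size=n, p=probs)

print(sample_tokens(seed=1234))
print(sample_tokens(seed=1234))   # identical to the line above
print(sample_tokens(seed=42))     # a different seed gives a different sequence
```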

However, it is the input certainties that come before this softmax which seem to exhibit non-determinism, observed in OpenAI models since 3.5-turbo and ada embeddings.

When there is ambiguity in token choice, such as in creative writing, the top two candidates may be very close in probability. Just a bit of variance originating from model “flaws” can then cause them to switch positions, or cause other tokens to occupy different portions of the random probability space, regardless of whatever sampling selection method follows.
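The toy example below (hypothetical logits, NumPy) illustrates the point: with two candidates separated by a thousandth of a logit, a perturbation well under 1% flips the greedy pick and changes a small fraction of the seeded samples, even though the sampler itself never varies.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

base = np.array([3.001, 3.000, 1.0, 0.0])     # top two almost tied (hypothetical)
jitter = np.array([-0.002, 0.002, 0.0, 0.0])  # tiny variance from model "flaws"

p_clean = softmax(base)
p_noisy = softmax(base + jitter)

print("greedy pick, clean:", p_clean.argmax())   # token 0
print("greedy pick, noisy:", p_noisy.argmax())   # token 1 -- the order flipped

# Identical seed on both sides: only the shifted probabilities differ.
draws_clean = np.random.default_rng(0).choice(4, size=100_000, p=p_clean)
draws_noisy = np.random.default_rng(0).choice(4, size=100_000, p=p_noisy)
print("samples that changed:", (draws_clean != draws_noisy).sum())
```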