Clarifications on setting temperature = 0

There are also other sources of randomness, like the state of the RNG and race conditions in multithreaded code.

The evidence that

T = 0 \equiv the greedy algorithm

is that as the temperature T approaches 0 in a stochastic process, the probability distribution becomes increasingly deterministic. At T = 0 (understood as the limit T → 0), the process always selects the option with the highest probability, which effectively turns it into a greedy algorithm.

This occurs because:

  1. The temperature parameter T controls the randomness in sampling.
  2. As T decreases, the probability differences between options are amplified.
  3. In the limit T → 0, this amplification grows without bound, so the highest probability option takes essentially all of the probability mass.
  4. Consequently, the process always chooses the most probable option, which is the definition of greedy sampling.
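To make steps 2 and 3 concrete, here is a short derivation for the softmax case. The symbols z_i (the raw scores or logits) and p_i(T) are my notation, not something from the thread, and the last line assumes there is a single largest score z_k.

```latex
p_i(T) = \frac{\exp(z_i / T)}{\sum_j \exp(z_j / T)}
\qquad\Longrightarrow\qquad
\frac{p_i(T)}{p_k(T)} = \exp\!\left(\frac{z_i - z_k}{T}\right)

% If z_k is the unique maximum, then z_i - z_k < 0 for every i \neq k, so
\lim_{T \to 0^+} \frac{p_i(T)}{p_k(T)} = 0
\qquad\text{and hence}\qquad
\lim_{T \to 0^+} p_k(T) = 1 .
```

Note that plugging T = 0 into the formula directly would divide by zero, so "T = 0" here means this limit. If two scores are exactly tied for the maximum, the limit splits the mass evenly between them, and which one gets picked depends on the tie-breaking rule.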

This behavior is consistent across various stochastic algorithms, including softmax sampling and Boltzmann exploration.
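For softmax sampling specifically, the effect is easy to check numerically. The following is a minimal sketch, not any particular library's implementation: the function name `softmax_with_temperature` and the example logits are made up for illustration, and treating `temperature == 0` as a hard argmax is an assumption about how one might special-case it.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities.

    Hypothetical helper for illustration. temperature == 0 is special-cased
    as a one-hot argmax, because the formula itself would divide by zero.
    """
    if temperature == 0:
        # Greedy choice: all mass on the first index with the largest logit.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]

    scaled = [z / temperature for z in logits]
    # Subtract the max before exponentiating to avoid overflow at small T.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores
for t in (1.0, 0.5, 0.1, 0.01, 0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 4) for p in probs])
```

As the temperature shrinks, the printed distribution concentrates on the first entry (the largest logit), and the special-cased T = 0 row is the same greedy choice made explicit.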

There’s no reason to do anything different here, but if they were going to, it would be stated explicitly, as it is in the Whisper documentation.
