GPT generation is non-deterministic by default. There is a more thorough discussion of this in this thread (and its associated link). TL;DR: the token with the "highest probability" is ill-defined, because the probabilities are multiplied and stored with a finite number of digits, so rounding differences can change which token comes out on top. Hope it helps
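To make the TL;DR concrete, here is a toy sketch in plain Python (not how GPT is actually implemented). It assumes two candidate tokens whose scores are mathematically tied; `pick_token` and the score values are made up for illustration. The only thing that changes between the two "runs" is the order in which the same three numbers are added.

```python
def pick_token(scores):
    """Greedy choice: index of the highest score (first one wins on a tie)."""
    return max(range(len(scores)), key=lambda i: scores[i])

# Hypothetical score for token 0, stored directly.
fixed_score = 0.6

# Hypothetical score for token 1: the same three terms, summed in two orders.
# With real numbers both sums equal 0.6; with floats they differ in the last digit.
sum_left_first = (0.1 + 0.2) + 0.3
sum_right_first = 0.1 + (0.2 + 0.3)

run_a = pick_token([fixed_score, sum_left_first])
run_b = pick_token([fixed_score, sum_right_first])

print(sum_left_first == sum_right_first)  # False
print(run_a, run_b)                       # the greedy pick differs between runs
```

In a real model the sums are far larger and the order of operations can vary from run to run (for example, depending on how the hardware schedules the work), so near-ties between tokens can resolve differently even with greedy decoding.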