Deterministic Results Impossible for GPT-4o

I’ve been trying, without success, to get GPT-4o via the Chat Completions API to return the same response for the same API arguments. Setting a seed and temperature to 0 results in slightly more deterministic behavior, but a large amount of randomness still remains.

Experiment: 10 API calls to GPT-4o with Same Arguments

Results: 5 unique responses

  • 1 response repeated 4 times
  • 1 response repeated 3 times
  • 3 responses appearing once each

The system fingerprint was the same for all 10 API calls.

Here are the arguments I passed to the chat completion endpoint for this experiment:

{
  "model": "gpt-4o-2024-08-06",
  "messages": [{"role": "user", "content": "same message every time"}],
  "temperature": 0,
  "top_p": null,
  "max_tokens": 2000,
  "stream": true,
  "seed": 1
}
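
For reference, here is roughly how the experiment can be reproduced with the openai Python SDK (v1.x); the prompt is a placeholder and streaming is omitted for brevity:

# Count unique responses across 10 identical requests.
from collections import Counter
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

responses = Counter()
for _ in range(10):
    completion = client.chat.completions.create(
        model="gpt-4o-2024-08-06",
        messages=[{"role": "user", "content": "same message every time"}],
        temperature=0,
        max_tokens=2000,
        seed=1,
    )
    responses[completion.choices[0].message.content] += 1

print(f"{len(responses)} unique responses out of 10 calls")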

Is it impossible to get GPT-4o via the Chat Completions API to return the same response given the same API arguments?


It is possible to get the same response when the prompt has a single clearly correct answer and only one way to write it.

You are correct, though: no parameter can produce truly repeatable results, not even discarding or binning by system fingerprint on calls made within the same minute.

Request logprobs; you will be able to see the variations between calls in the token values.
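
For example, something along these lines with the Python SDK (same placeholder prompt as above) prints the token-level logprobs for each call:

# Request logprobs so token-level differences between calls become visible.
from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "same message every time"}],
    temperature=0,
    seed=1,
    logprobs=True,
    top_logprobs=2,
)

for token in completion.choices[0].logprobs.content:
    print(token.token, token.logprob)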

No explanation of the cause of non-determinism in generation has ever been offered, and a seed parameter cannot overcome it. The last model that was deterministic was text-davinci-003.

If you are sending the same messages input and want the same response you know you received before, you can simply hash the request and cache the output.
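
A minimal sketch of that idea (in-memory cache only; the function and variable names are just illustrative):

# Return the cached response when the exact same request arguments are seen again.
import hashlib
import json

_cache = {}

def cached_completion(client, **kwargs):
    key = hashlib.sha256(
        json.dumps(kwargs, sort_keys=True, default=str).encode()
    ).hexdigest()
    if key not in _cache:
        _cache[key] = client.chat.completions.create(**kwargs)
    return _cache[key]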


Hi,

The nature of GPU clusters and the batching of API jobs means that variation can occur between runs. To get fully deterministic behaviour, the GPUs would need to be synchronised, and doing that would mean running at lower clock speeds, reducing performance.

The upshot of this is that the models will always have some degree of variation in output across calls.


OK, so would this be a workaround for you, or is saving a response not possible?

if prompt == same_prompt_1:
    return saved_output_1

It would also probably be cheaper, or am I missing something?

Appreciate the suggestion, but I’m looking for ways to make LLMs deterministic in their responses, not a hash-and-cache approach.

Perhaps you could go one layer out and describe the broader problem you are trying to solve?

Why do you feel determinism is necessary?


(Not even my local llama3.2 appears to be deterministic with temp 0 and top_p 0.9.)