Why are the temperature and top_p of o1 models fixed to 1, not 0?

From: https://platform.openai.com/docs/guides/reasoning

temperature, top_p and n are fixed at 1, while presence_penalty and frequency_penalty are fixed at 0.

Why are temperature and top_p set to 1? This is a reasoning model, not a creative model. Wouldn’t setting temperature and top_p high increase the likelihood of hallucinations, by sampling lower-probability tokens that are less likely to produce correct outcomes?
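To make that concrete, here is a toy sketch of what temperature and top_p do to the next-token distribution (the logits below are made up purely for illustration, nothing from the o1 stack): lowering temperature sharpens the distribution toward the most likely token, and a small top_p cuts off the unlikely tail entirely.

```python
# Toy illustration of temperature scaling and top_p (nucleus) filtering.
# The logit values are invented for the example.
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: lower temperature sharpens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p=1.0):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept

logits = [4.0, 3.0, 1.0, -2.0]  # hypothetical next-token logits

for t in (1.0, 0.2):
    probs = softmax(logits, temperature=t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
# temperature=1.0 leaves real probability mass on the unlikely tokens;
# temperature=0.2 makes the top token dominate, approaching greedy decoding.

print("top_p=0.5 keeps token indices:", top_p_filter(softmax(logits), top_p=0.5))
```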

For me, that’s not just a theoretical prediction of how those values change the output. In my experience across GPT-3, 3.5, 4, 4o, the turbo variants, Claude, Phi, Mistral, and the Llama models, in eval environments they all produce the best code (both in quality and in sticking to the instructions) when temperature is 0 and top_p is very close to 0. I don’t mind non-creative and repetitive responses.
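Concretely, this is the kind of near-greedy setup I mean in those evals. A minimal sketch assuming the openai Python SDK; the model name and prompt are just placeholders:

```python
# Near-greedy decoding for code generation on a non-reasoning model.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder: any non-reasoning model
    messages=[{"role": "user", "content": "Write a function that parses ISO-8601 dates."}],
    temperature=0,   # near-deterministic token choice
    top_p=0.01,      # keep only the head of the distribution
)
print(resp.choices[0].message.content)

# Per the docs quoted above, o1 models fix temperature and top_p at 1,
# so these sampling knobs can't be lowered the same way there.
```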

Please help me understand that choice for default temp and top_p.


Anyone from OpenAI care to respond?

It’s possible the temperature is dynamically determined and changes throughout the “thinking” and “reflection” stages.


Possible, although there have been temperature-looking problems reported when prompting in other languages, with oddball third-language tokens showing up, so the creativity might be getting applied in the wrong place.

Temperature also matters if you want variations to evaluate a best response from, as with the best_of API parameter on completions (which picks by total logit probability rather than using an AI judge). best_of: 10 is wasting your money if there is no sampling variety.
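For reference, a sketch of that best_of mechanism on the legacy completions endpoint (openai Python SDK assumed; the model name and prompt are placeholders):

```python
# best_of samples several candidates server-side and returns the one(s) with
# the highest total token log probability. With temperature at 0 every
# candidate is (nearly) identical, so the extra generations buy nothing.
from openai import OpenAI

client = OpenAI()

resp = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt="Q: Summarize why sampling variety matters for best-of selection.\nA:",
    max_tokens=100,
    temperature=0.8,  # some variety, so the 10 candidates actually differ
    best_of=10,       # 10 candidates scored by token logprobs...
    n=1,              # ...but only the best-scoring one is returned
)
print(resp.choices[0].text)
```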

The poor-man’s version from last year…