The generated code varies every time, even with a low temperature

I am using the Assistants API with gpt-4o to generate Python code. We set `temperature=0.01` to reduce randomness, along with some instructions.
Even with this setup, the generated code varies across threads, even when the prompt is exactly the same.
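For reference, the setup described above looks roughly like this. The instructions, prompt, and model name are placeholders, and the run-level `temperature` parameter and `create_and_poll` helper are assumptions about the current `openai` SDK, not a definitive implementation:

```python
def build_run_request(assistant_id: str, thread_id: str) -> dict:
    """Keyword arguments for runs.create_and_poll (illustrative values)."""
    return {
        "assistant_id": assistant_id,
        "thread_id": thread_id,
        "temperature": 0.01,  # low, but still leaves some sampling randomness
    }

def generate_code(prompt: str) -> str:
    """Sketch of one Assistants API round trip; requires an API key to run."""
    from openai import OpenAI  # pip install openai
    client = OpenAI()
    assistant = client.beta.assistants.create(
        model="gpt-4o",
        instructions="You write Python code only.",
    )
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content=prompt
    )
    client.beta.threads.runs.create_and_poll(
        **build_run_request(assistant.id, thread.id)
    )
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    return messages.data[0].content[0].text.value
```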

I am wondering whether the instructions are interfering, or whether temperature has no effect on code generation… Has anyone solved this issue?
I would appreciate your help. Thanks.

Setting temperature to 0.0 helps, but it’s pretty much impossible to get completely deterministic outputs from the model. Setting the seed for your request to a constant value will get you even closer to the same output, but I’ll tell you that even that doesn’t fully work.
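As a sketch, that combination looks like this on the Chat Completions endpoint (which is where `seed` is exposed); the model name and seed value are illustrative:

```python
def near_deterministic_params(prompt: str) -> dict:
    """Request settings that minimize, but cannot eliminate, output variation."""
    return {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.0,  # pick the most likely token at each step
        "seed": 42,          # best-effort reproducibility, not a guarantee
    }

def generate(prompt: str) -> str:
    """Requires an API key; shown for illustration only."""
    from openai import OpenAI  # pip install openai
    client = OpenAI()
    resp = client.chat.completions.create(**near_deterministic_params(prompt))
    return resp.choices[0].message.content
```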

In addition to the fact that these models are non-deterministic, OpenAI is constantly fine-tuning the model, so the weights may be changing under you several times a day.

As far as I understand, the Assistants API does not support the seed parameter yet. Is this correct?

I’m not sure but you could be right.

I found this, which may help to control the temperature and top_p at the same time for code generation:
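A minimal sketch of tightening both knobs at once (values are illustrative). One caveat worth noting: OpenAI's API reference generally recommends altering `temperature` or `top_p`, but not both:

```python
def constrained_sampling_params(prompt: str) -> dict:
    """Restrict both sampling knobs: temperature sharpens the token
    distribution, while top_p truncates it to the smallest set of tokens
    covering the given probability mass."""
    return {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.0,  # near-greedy decoding
        "top_p": 0.1,        # only consider the top 10% probability mass
    }
```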

Correct. It’s only available for Chat Completions, since Chat Completions as a standard doesn’t maintain previous conversation state.
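If you do use Chat Completions with a fixed `seed`, the `system_fingerprint` field on the response is the documented way to tell whether the backend configuration changed between two calls, which would explain differing outputs even with identical parameters. A hedged sketch (model name and seed are illustrative):

```python
from typing import Optional

def same_backend(fp_a: Optional[str], fp_b: Optional[str]) -> bool:
    """Two responses are only comparable for determinism purposes if they
    report the same non-null system_fingerprint."""
    return fp_a is not None and fp_a == fp_b

def fingerprints(prompts):
    """Collect the fingerprint of each response; requires an API key."""
    from openai import OpenAI  # pip install openai
    client = OpenAI()
    out = []
    for p in prompts:
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": p}],
            seed=42,
            temperature=0.0,
        )
        out.append(resp.system_fingerprint)
    return out
```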