Possible bug? gpt-3.5-turbo non-deterministic even with temperature zero

Hello everyone. I’ve been using text-davinci-003 and gpt-3.5-turbo for some time through the API and in the Playground, and I’ve noticed that the output is non-deterministic even though I’ve set the temperature to zero - i.e. sometimes I see different completions for exactly the same prompt.

I would’ve expected a temperature of zero to force the model to select the most probable token at each step and hence for its output to be deterministic. I understand that it might run with limited precision and there might be ties for the most probable token, but I would’ve expected ties to be broken deterministically - or for there to be an option for that.

The lack of determinism makes creating tests that wrap live models difficult and complicates prompt engineering - is the change in output due to the change in prompt or a result of the model’s non-determinism? It also makes cascades of models unstable - their outputs can change significantly even for exactly the same input due to a change in the output of one of the models in the cascade.

Is non-determinism with temperature zero the intended behaviour or is it a bug? If it is the intended behaviour, is there a workaround to make the models deterministic without making multiple calls to the API? I would really like the API and Playground to provide some way to generate deterministic completions. Thanks!
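In the meantime, repeatability is easy to quantify in code: call the API several times with the same prompt at temperature 0 and tally the distinct completions. This is a minimal sketch assuming the OpenAI Python client (v1 style); the helper names `sample_completions` and `distinct_completions` are mine, not part of any library.

```python
from collections import Counter

def distinct_completions(completions):
    """Tally how often each distinct completion string appears."""
    return Counter(completions)

def sample_completions(client, prompt, n=10, model="gpt-3.5-turbo"):
    """Call the chat API n times at temperature 0 and collect the outputs.

    `client` is assumed to be an instance of the OpenAI Python client,
    e.g. `openai.OpenAI()` (illustrative; check your client version).
    """
    outputs = []
    for _ in range(n):
        resp = client.chat.completions.create(
            model=model,
            temperature=0,
            messages=[{"role": "user", "content": prompt}],
        )
        outputs.append(resp.choices[0].message.content)
    return outputs

# With a real client:
#   counts = distinct_completions(sample_completions(client, my_prompt))
# A single key in `counts` would indicate deterministic behaviour for
# that prompt; in my experience more than one distinct completion shows up.
```

A test wrapping a live model could assert that `counts` has exactly one key, which makes the flakiness described above concrete and measurable.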

As a test case, using gpt-3.5-turbo in the Playground, with temperature zero, maximum length 2048, no system message, and the following user message:

OpenAI is an American artificial intelligence (AI) research laboratory consisting of the non-profit OpenAI Incorporated (OpenAI Inc.) and its for-profit subsidiary corporation OpenAI Limited Partnership (OpenAI LP). OpenAI conducts AI research with the declared intention of promoting and developing a friendly AI. OpenAI systems run on an Azure-based supercomputing platform from Microsoft.[5][6][7] The organization was founded in San Francisco in 2015 by Sam Altman, Reid Hoffman, Jessica Livingston, Elon Musk, Ilya Sutskever, Peter Thiel and others,[8][1][9] who collectively pledged US$1 billion. Musk resigned from the board in 2018 but remained a donor and eventually committed US$100 million. Microsoft provided OpenAI LP with a $1 billion investment in 2019 and a second multi-year investment in January 2023, reported to be $10 billion.

In December 2015, Sam Altman, Greg Brockman, Reid Hoffman, Jessica Livingston, Peter Thiel, Elon Musk, Amazon Web Services (AWS), Infosys, and YC Research announced[11] the formation of OpenAI and pledged over $1 billion to the venture. The organization stated it would “freely collaborate” with other institutions and researchers by making its patents and research open to the public.[12][13] OpenAI is headquartered at the Pioneer Building in Mission District, San Francisco.[14][3]

According to Wired, Brockman met with Yoshua Bengio, one of the “founding fathers” of the deep learning movement, and drew up a list of the “best researchers in the field”.[15] Brockman was able to hire nine of them as the first employees in December 2015.[15] In 2016 OpenAI paid corporate-level (rather than nonprofit-level) salaries, but did not pay AI researchers salaries comparable to those of Facebook or Google.[15]

(Microsoft’s Peter Lee stated that the cost of a top AI researcher exceeds the cost of a top NFL quarterback prospect.[15]) OpenAI’s potential and mission drew these researchers to the firm; a Google employee said he was willing to leave Google for OpenAI “partly because of the very strong group of people and, to a very large extent, because of its mission.”[15] Brockman stated that “the best thing that I could imagine doing was moving humanity closer to building real AI in a safe way.”[15] OpenAI researcher Wojciech Zaremba stated that he turned down “borderline crazy” offers of two to three times his market value to join OpenAI instead.[15]

The completion is usually

OpenAI is an AI research laboratory founded in San Francisco in 2015 by Sam Altman, Reid Hoffman, Jessica Livingston, Elon Musk, Ilya Sutskever, Peter Thiel, and others. The organization aims to promote and develop friendly AI and conducts AI research using an Azure-based supercomputing platform from Microsoft. OpenAI consists of the non-profit OpenAI Inc. and its for-profit subsidiary corporation OpenAI LP. In 2019, Microsoft provided OpenAI LP with a $1 billion investment, and in January 2023, they made a second multi-year investment reported to be $10 billion. OpenAI pledged over $1 billion to the venture and stated it would “freely collaborate” with other institutions and researchers by making its patents and research open to the public. OpenAI is headquartered in the Pioneer Building in Mission District, San Francisco.

but sometimes

OpenAI is an AI research laboratory that aims to promote and develop friendly AI. It was founded in San Francisco in 2015 by a group of individuals who collectively pledged over $1 billion. OpenAI conducts AI research and runs on an Azure-based supercomputing platform from Microsoft. The organization has a non-profit arm, OpenAI Inc., and a for-profit subsidiary, OpenAI LP. OpenAI has hired some of the best researchers in the field, and its potential and mission have drawn researchers to the firm. OpenAI has pledged to freely collaborate with other institutions and researchers by making its patents and research open to the public.

Temperature 0 makes the output more deterministic, but the model is still drawing from a pool of tokens, so some randomness will always come into play. Unless you prompt it very specifically, you can expect the results to differ slightly from time to time.

Hi udm17. Thanks for your reply. It sounds like you expect non-deterministic behaviour with temperature zero, so maybe I’ll just have to live with it. It doesn’t seem to be a problem with AI21’s models and I haven’t tried Anthropic’s.

I suspect that, in OpenAI’s case, it’s a result of load balancing between differently configured servers or performance optimisations or something like that. It might be by design - that ties for most probable tokens are broken randomly, for example. If that is the case, there definitely should be an option to have models behave deterministically - it would have lots of benefits for software testing, prompt engineering, etc.
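The limited-precision hypothesis is plausible because floating-point addition is not associative: the same reduction performed in a different order, as can happen across differently configured servers or GPU kernels, can produce slightly different logits and flip a near-tie between the top tokens. A tiny illustration of the underlying effect:

```python
# Floating-point addition is not associative, so summing the same
# values in a different order can give slightly different results -
# enough, in principle, to flip a near-tie between two top tokens.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)   # False: 0.6000000000000001 vs 0.6
```

This is only a sketch of the mechanism, not a claim about how OpenAI's serving stack actually computes logits.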

Anyway, thanks for replying udm17 - much appreciated.
