Same prompt and parameter settings produce different responses

bj · May 10, 2022, 4:51pm

We are prototyping natural-language-question-to-SQL generation using code-davinci-002, and we are seeing strange behavior. Under the same conditions (same schema input, same parameter settings, and specifically temperature set to zero), two different developers are getting two very different answers. The only difference I can think of is the developers' locations.

The differences are as follows:

The generated SQL queries are different.
For one developer, the returned response includes the schema itself, while for the second developer, Codex repeats the query over and over until it hits the token limit.

Any thoughts on how to fix this issue?

ali.yeysides · May 10, 2022, 5:00pm

I would try exposing the token probabilities; that might help with debugging.
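
A minimal sketch of how such a comparison might look, assuming each developer saves the JSON response from a request made with `logprobs` enabled, and assuming the saved response has the Completions-style shape `choices[0].logprobs.tokens` / `token_logprobs`. The helper names and the sample data below are hypothetical and only for illustration.

```python
from typing import List, Optional, Tuple


def _token_pairs(resp: dict) -> List[Tuple[str, float]]:
    """Extract (token, logprob) pairs; assumes a Completions-style response shape."""
    lp = resp["choices"][0]["logprobs"]
    return list(zip(lp["tokens"], lp["token_logprobs"]))


def first_divergence(resp_a: dict, resp_b: dict) -> Optional[dict]:
    """Return the first position where the two completions differ, or None."""
    a, b = _token_pairs(resp_a), _token_pairs(resp_b)
    for i in range(max(len(a), len(b))):
        item_a = a[i] if i < len(a) else None  # None means this output ended earlier
        item_b = b[i] if i < len(b) else None
        if item_a is None or item_b is None or item_a[0] != item_b[0]:
            return {"index": i, "a": item_a, "b": item_b}
    return None


# Made-up sample responses, not real model output.
dev_1 = {"choices": [{"logprobs": {
    "tokens": ["SELECT", " name", " FROM"],
    "token_logprobs": [-0.01, -0.70, -0.02]}}]}
dev_2 = {"choices": [{"logprobs": {
    "tokens": ["SELECT", " *", " FROM"],
    "token_logprobs": [-0.01, -0.69, -0.03]}}]}

print(first_divergence(dev_1, dev_2))
```

If the two log probabilities at the reported index are nearly equal, small numerical differences could plausibly tip the choice between tokens; a large gap would point more toward the inputs themselves differing.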

jazzcript · May 10, 2022, 5:09pm

Increasing the frequency penalty can help prevent repetition. Also have a look at this topic to understand why it can’t be fully deterministic.
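
As a rough sketch of what that change might look like, the snippet below builds a Completions-style request body with `frequency_penalty` set. No network call is made. The helper name `build_request`, the placeholder prompt, the `max_tokens` value, and the starting penalty of 0.5 are all assumptions for illustration, not values from this thread.

```python
import json


def build_request(prompt: str, frequency_penalty: float = 0.5) -> dict:
    """Build a Completions-style request body (hypothetical helper, no API call)."""
    # The documented range for this parameter is -2.0 to 2.0.
    if not -2.0 <= frequency_penalty <= 2.0:
        raise ValueError("frequency_penalty must be between -2.0 and 2.0")
    return {
        "model": "code-davinci-002",
        "prompt": prompt,
        "temperature": 0,                        # as in the original setup
        "max_tokens": 256,                       # assumed value
        "frequency_penalty": frequency_penalty,  # assumed starting point; 0 disables it
    }


body = build_request("-- placeholder schema and question\nSELECT")
print(json.dumps(body, indent=2))
```

Positive values penalize tokens in proportion to how often they have already appeared, so it is probably worth raising the value gradually and checking that the generated SQL stays valid.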

bj · May 10, 2022, 5:34pm

Thanks, will take a look.

bj · May 10, 2022, 5:40pm

Thanks. Will take a look at the frequency penalty. I fully understand the non-determinism in parallelization. Codex's behavior seems to be deterministic within my own sandbox environment: the same input produces the same output on repeated testing. The non-determinism becomes immediately apparent across the two developer sandboxes.
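
One way to rule out an accidental difference in the inputs before blaming the model is to have both developers hash the exact request body they send and compare the digests. This is only a sketch: the helper name `request_fingerprint` and the sample payloads are hypothetical, and it assumes the request can be captured as a plain dictionary.

```python
import hashlib
import json


def request_fingerprint(payload: dict) -> str:
    """Hash a request body in a canonical form so two sandboxes can compare it."""
    canonical = json.dumps(
        payload, sort_keys=True, ensure_ascii=False, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Made-up payloads: identical except for the line ending inside the prompt.
mine = {"model": "code-davinci-002", "prompt": "-- schema\nSELECT", "temperature": 0}
theirs = dict(mine, prompt="-- schema\r\nSELECT")

print(request_fingerprint(mine) == request_fingerprint(theirs))  # False
```

Matching digests would suggest the prompts and parameters really are identical byte for byte; differing digests would point to something like line endings, trailing whitespace, or a default parameter that differs between the two environments.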
