Same prompt and parameter settings produce different responses

We are prototyping natural-language-question-to-SQL generation using code-davinci-002, and we are observing strange behavior. Under identical conditions (same schema input, same parameter settings, with temperature set to zero), two developers are getting two very different answers. The only difference I can think of is the developers' locations.

The differences are as follows:

  1. The generated SQLs are different.
  2. The response returned to one developer includes the schema itself, while for the second developer Codex repeats the query over and over until hitting the token limit.

Any thoughts on how to fix this issue?

I would try exposing the probabilities of the tokens (logprobs); that might help with debugging.
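As a sketch of that idea: the legacy Completions endpoint can return per-token log probabilities when you pass `logprobs=N`. Once you have them, places where the top two candidate tokens score nearly the same are exactly where greedy (temperature-0) decoding could plausibly flip between runs or environments. The helper below is a hypothetical debugging aid, not part of any SDK; the `tokens`/`token_logprobs`/`top_logprobs` field names match the shape of the logprobs object that endpoint returned.

```python
def near_ties(logprobs, margin=0.1):
    """Flag positions where the best and second-best candidate tokens
    have log probabilities within `margin` of each other.

    `logprobs` is expected to look like the logprobs object from the
    legacy Completions response:
        {"tokens": [...], "token_logprobs": [...], "top_logprobs": [...]}
    Returns the list of chosen tokens at near-tie positions.
    """
    ties = []
    for token, top in zip(logprobs["tokens"], logprobs["top_logprobs"]):
        # Sort the candidate logprobs at this position, best first.
        alts = sorted(top.values(), reverse=True)
        if len(alts) > 1 and alts[0] - alts[1] < margin:
            ties.append(token)
    return ties
```

Running this over both developers' responses and comparing where the near-ties fall could show whether the divergence starts at a genuinely ambiguous token.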


Increasing the frequency penalty can help prevent repetition. Also have a look at this topic to understand why the output can't be fully deterministic.
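For reference, a minimal sketch of such a request, using the parameter names the legacy Completions endpoint accepted (the `stop` value and `max_tokens` here are illustrative assumptions, not settings from the original post):

```python
def build_completion_params(prompt, frequency_penalty=0.5):
    """Assemble request parameters for a text-to-SQL completion call.

    frequency_penalty > 0 lowers a token's score in proportion to how
    often it has already appeared in the output, which discourages the
    verbatim repetition loop described above.
    """
    return {
        "model": "code-davinci-002",
        "prompt": prompt,
        "temperature": 0,          # greedy decoding, as in the original setup
        "max_tokens": 256,         # illustrative cap, tune for your queries
        "frequency_penalty": frequency_penalty,
        "stop": [";"],             # assumed: stop at the SQL statement terminator
    }
```

A `stop` sequence is a second line of defense here: even if repetition starts, a stop sequence at the end of the first complete statement keeps the response from running to the token limit.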

Thanks, will take a look.

Thanks. I will take a look at the frequency penalty. I understand the non-determinism introduced by parallelization, but Codex behavior seems deterministic within my own sandbox environment: the same input produces the same output on repeated testing. The non-determinism only becomes apparent across the two developers' sandboxes.