We are prototyping natural-language-question-to-SQL generation using code-davinci-002, and we observe a strange behavior. Under the same conditions (same schema input and same parameter settings, specifically temperature set to zero), two different developers are getting two very different answers. The only difference I can think of is the developers' locations.
The differences are as follows:
The generated SQL queries are different.
The response returned for one developer includes the schema itself, while for the second developer Codex repeats the query over and over until it hits the token limit.
Thanks. I will take a look at the frequency penalty. I fully understand the non-determinism that comes from parallelization. However, Codex's behavior seems to be deterministic within my own sandbox environment: the same input produces the same output on repeated testing. The non-determinism becomes immediately apparent only when comparing across the two developer sandboxes.
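Before attributing the difference to location, it may be worth confirming that the two sandboxes really send byte-identical requests, since a stray trailing newline, different line endings, or a missing stop sequence could plausibly change a temperature-zero completion. Below is a minimal sketch of how each developer could fingerprint the exact prompt and parameters and compare the digests. The helper names (request_fingerprint, first_difference) and the sample schema and parameter values are hypothetical, made up for illustration; nothing here calls the API.

```python
import hashlib
import json


def request_fingerprint(prompt: str, params: dict) -> str:
    """Return a SHA-256 hex digest of the exact prompt and parameters."""
    payload = {"prompt": prompt, "params": params}
    # sort_keys makes the digest independent of dict insertion order;
    # the prompt text itself is hashed exactly as given, whitespace included.
    canonical = json.dumps(
        payload, sort_keys=True, ensure_ascii=False, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def first_difference(a: str, b: str) -> int:
    """Return the index of the first differing character, or -1 if equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One string is a prefix of the other (or they are identical).
    return -1 if len(a) == len(b) else min(len(a), len(b))


if __name__ == "__main__":
    # Hypothetical values; substitute the real schema prompt and settings.
    prompt_dev_a = "# Table users(id, name)\n# Question: how many users?\nSELECT"
    prompt_dev_b = "# Table users(id, name)\r\n# Question: how many users?\r\nSELECT"
    params = {"temperature": 0, "max_tokens": 150, "stop": [";"]}

    print(request_fingerprint(prompt_dev_a, params))
    print(request_fingerprint(prompt_dev_b, params))
    print(first_difference(prompt_dev_a, prompt_dev_b))
```

If the two digests match, the inputs are identical and the difference lies elsewhere; if they differ, first_difference points at the first character where the two prompts diverge.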