Deterministically different responses when calling 3.5-turbo from different locations

I’m noticing a weird issue: with the same prompt, same seed, and temperature = 0, I’m getting different completions when I send the request to gpt-3.5-turbo from two different locations.

The first location is my laptop; the second is a GitHub Actions virtual machine (which presumably lives in Azure, and so might have some special connection to GPT?).

What’s weird is that the responses from each location are individually deterministic… and yet different from each other. I’ve hashed the input prompt to ensure the inputs are equal, and also made sure the system fingerprints returned to both locations are equal.
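For the hashing step, here is a minimal sketch (stdlib only, with an illustrative payload, not my actual prompt) that canonicalizes the whole request body before hashing, so key order or whitespace can’t mask a real difference:

```python
import hashlib
import json


def payload_hash(payload: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so only real
    # differences in the request body change the hash.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative request body -- not the actual prompt.
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Say hello."}],
    "temperature": 0,
    "seed": 42,
}
print(payload_hash(payload))
```

Running this on both machines and comparing the printed digest rules out any difference in what the client code thinks it is sending.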

Can anyone explain this?

My next thought is the client libraries, and logging what is actually being sent to the AI.

The Python library can deliver an APIResponse object that lets you retrieve the underlying httpx request itself.
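A sketch of that, assuming the openai Python SDK (v1.x) and its with_raw_response helper; the model, message, and hashing helper here are illustrative, not prescribed:

```python
# Sketch assuming the openai Python SDK (>= 1.x) and its
# with_raw_response helper; treat the exact attribute names as
# assumptions to verify against your installed SDK version.
import hashlib
import os


def body_digest(raw_bytes: bytes) -> str:
    # Hash the wire bytes so the two environments can compare requests.
    return hashlib.sha256(raw_bytes).hexdigest()


if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI

    client = OpenAI()
    raw = client.chat.completions.with_raw_response.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Say hello."}],
        temperature=0,
        seed=42,
    )
    # raw.http_response is the underlying httpx.Response; its .request
    # carries what was actually sent on the wire.
    request = raw.http_response.request
    print("request body sha256:", body_digest(request.content))
    print(raw.parse().choices[0].message.content)
```

If the two locations print the same request digest but still get different completions, the divergence is on the server side, not in the client stack.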

If geography routes you to different datacenters, one might suppose something differs about the system’s random number generation there, so the seed doesn’t take effect the same way. An out-there theory. You can instead try top_p = 0.0000001, without temperature or seed, to force an answer that is as deterministic as possible.
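To illustrate why a tiny top_p forces near-determinism, here is a toy nucleus-sampling filter (my own sketch, not the API’s actual implementation): tokens are kept in probability order until their cumulative mass reaches top_p, so a near-zero top_p keeps only the single most probable token and sampling collapses to greedy decoding.

```python
def nucleus_filter(probs, top_p):
    # Sort token indices by probability, highest first.
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for idx, p in ranked:
        kept.append(idx)
        total += p
        # Stop as soon as the cumulative mass reaches top_p.
        if total >= top_p:
            break
    return kept


probs = [0.1, 0.6, 0.3]
print(nucleus_filter(probs, 0.0000001))  # [1] -- only the argmax survives
print(nucleus_filter(probs, 0.95))       # [1, 2, 0] -- almost everything kept
```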

I definitely cannot explain this. However… Azure instances will be on slightly different hardware, with slightly different systems generating and feeding the tokens in.

I do a lot of testing with small neural networks, and I have noticed (and confirmed) that running the exact same network on different GPUs alters the outputs. There’s a lot going on in the tails of the vectors (as you can see when you quantize LLMs), so floating-point accuracy affects how a network processes data.

This is not computing as we have experienced it in the past; it’s all probabilities, so variability is inherent. My mind has come to think of it as ‘soft’ computing/programming. Soft as in squishy, whereas traditional programming (everything up until now) is ‘hard’: when you ask for a zero, you get a zero. No ifs, ands, or buts, unless you made a mistake. That’s not how this stuff is, in my experience.
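A concrete instance of that squishiness: floating-point addition isn’t associative, so the same sum computed in a different grouping order (as different GPUs and kernels do) can differ in the last bits, and those bits can flip which token wins a close race.

```python
# Same three numbers, two grouping orders, two different answers.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left)           # 0.6000000000000001
print(right)          # 0.6
print(left == right)  # False
```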