I'm experiencing inconsistent outputs from GPT-4 on Azure OpenAI, even with temperature set to 0 and no changes to my prompts or environment. Could these variations be attributed to unseen updates or bug fixes rolled out by OpenAI?
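For context, here is roughly how I'm calling the deployment (a minimal sketch using the openai Python SDK v1.x against Azure; the endpoint, key, deployment name, and prompt are placeholders):

```python
from openai import AzureOpenAI

# Placeholder credentials and endpoint -- not my real values.
client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",
    api_key="<AZURE_OPENAI_KEY>",
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="gpt-4",   # Azure deployment name (placeholder)
    temperature=0,   # requesting deterministic sampling
    messages=[{"role": "user", "content": "Summarize this ticket: ..."}],
)

print(response.choices[0].message.content)
```

Running the same call repeatedly, with identical inputs, still produces noticeably different completions from one day to the next.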

If so, how can developers manage this lack of immutability and ensure consistent behavior in production applications?