GPT Behavior Changes in Azure OpenAI Despite Identical Setup

Hello,

I’ve encountered a recurring issue with GPT in Azure OpenAI. The model’s behavior seems to change from week to week, even though I’m calling the same model version (gpt-4o-2024-08-06) under identical conditions.

Here’s the context:

  1. Consistent Prompts: The prompts I use haven’t changed.
  2. Stable Environment: My Python package versions remain the same.
  3. Model Temperature: Set to 0 to minimize randomness in the outputs (see the call sketch below this list).
  4. Tested Prompts: I’ve tested these prompts with multiple few-shot configurations in the past, and they consistently produced the same results.

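For reference, here’s a simplified sketch of the kind of call I’m making. The endpoint, key, deployment name, and messages are placeholders, and the seed parameter plus system_fingerprint logging are additions I’m experimenting with, since my understanding is that the fingerprint changes when the backend configuration changes:

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",  # placeholder
    api_key="YOUR-API-KEY",                                   # placeholder
    api_version="2024-06-01",
)

response = client.chat.completions.create(
    model="my-gpt4o-deployment",  # Azure deployment pinned to gpt-4o-2024-08-06
    temperature=0,                # minimizes sampling randomness
    seed=42,                      # optional: best-effort reproducibility
    messages=[
        {"role": "system", "content": "You are a request router."},
        {"role": "user", "content": "<same test input as previous weeks>"},
    ],
)

# system_fingerprint identifies the backend configuration; if it changes
# between runs while the model version string stays the same, the service
# itself has been updated.
print(response.system_fingerprint)
print(response.choices[0].message.content)
```

If the fingerprint differs from one week to the next while my code is unchanged, I assume that would point to a service-side update rather than anything in my setup.
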
However, week by week, the model responds differently to the same inputs. This directly impacts my use case (routing requests and generating responses) and forces me to tweak prompts regularly, because cases that used to work no longer produce the expected results.

Is it normal for the model to exhibit this much variability? Are changes being made to the model or its serving infrastructure behind the scenes that might explain it?

I would appreciate any insights or clarification on this matter.

Thank you!