Intermittent Multiple Responses in Single Output with gpt-5.2-chat-latest

I am reporting an issue where the gpt-5.2-chat-latest model intermittently generates multiple similar answers within a single API response. I am accessing the model via Azure OpenAI Service using LangChain.

Environment Details:

  • Platform: Azure OpenAI Service

  • Model Name: gpt-5.2-chat-latest

  • Framework: LangChain (ChatOpenAI / AzureChatOpenAI)

  • Method: Chat Completion

Observed Behavior:

  • Intermittency: The issue occurs intermittently rather than on every call; most requests return a single answer.

  • Multiple Answers: When the issue occurs, the single response string contains 2 to 4 distinct answers to the same prompt.

  • Content: The answers are near-duplicates in content and appear sequentially within the same output block.

Steps to Reproduce:

  1. Initialize the LangChain Azure OpenAI client with gpt-5.2-chat-latest.

  2. Send a user query.

  3. Observe that occasionally the returned string contains 2 to 4 similar responses combined.
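The steps above can be sketched against the raw Azure OpenAI REST interface (stdlib only, so it doubles as a minimal repro without LangChain in the loop). The endpoint, deployment name, API version, and key below are placeholders, not values from the original report:

```python
import json
import urllib.request


def build_chat_request(endpoint: str, deployment: str, api_version: str,
                       api_key: str, prompt: str):
    """Build the Azure OpenAI chat-completions request (URL, headers, body)."""
    url = (f"{endpoint}/openai/deployments/{deployment}"
           f"/chat/completions?api-version={api_version}")
    headers = {"Content-Type": "application/json", "api-key": api_key}
    body = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "n": 1,  # explicitly request a single completion
    }).encode("utf-8")
    return url, headers, body


def ask(endpoint: str, deployment: str, api_version: str,
        api_key: str, prompt: str) -> str:
    """Send one chat completion and return the first choice's text."""
    url, headers, body = build_chat_request(
        endpoint, deployment, api_version, api_key, prompt)
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Calling `ask(...)` repeatedly with the same query and logging responses that contain more than one answer is enough to reproduce the intermittent behavior; note that `n=1` is set explicitly, so multiple answers in one string are not explained by the request parameters.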

Sanitized Example (General Topic):
Below is an example of the behavior.

User Query:
“What are the benefits of drinking water?”

Model Response (Bugged):

### Benefits of Drinking Water
Staying hydrated is crucial for your health. It helps increase energy, improves skin complexion, and aids digestion. Remember to drink plenty of water daily.

### Benefits of Drinking Water
Water is essential for health. It boosts energy levels, keeps skin moisturized, and supports digestion. Aim for 8 glasses a day.

### Why Water is Important
Drinking water helps with energy, skin health, and digestion. It is vital for your well-being.

(Note: The model provides 2-4 similar answers in a row within a single response.)
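Because the duplicated answers in the sample above each begin with a markdown `### ` heading, a lightweight guard can flag affected responses before they reach users. A sketch; the heading-based splitting and the similarity threshold are assumptions based on the sanitized example, not part of the original report:

```python
import difflib
import re


def find_duplicate_answers(text: str, threshold: float = 0.6):
    """Split a response on markdown '### ' headings and report pairs of
    sections that are near-duplicates of each other.

    Returns a list of (heading_a, heading_b, similarity) tuples.
    """
    # Each chunk is one "answer": its heading line plus the body text.
    chunks = [c.strip() for c in re.split(r"(?m)^###\s+", text) if c.strip()]
    duplicates = []
    for i, a in enumerate(chunks):
        for b in chunks[i + 1:]:
            ratio = difflib.SequenceMatcher(None, a, b).ratio()
            if ratio >= threshold:
                duplicates.append((a.splitlines()[0], b.splitlines()[0], ratio))
    return duplicates
```

Running this on the bugged response above would report the repeated "Benefits of Drinking Water" sections; a non-empty result can trigger a retry or a log entry while the underlying issue is investigated.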

When you interact with an AI, the "thought process" usually happens behind the scenes. What you are describing, however, with several potential answers appearing in one output, sounds like a peek into the inference and streaming process. It could be parallel sampling or beam search, a streaming or latency problem, or UI/UX experimentation. It seems most probable that what was observed was a leak from a backend process in which the model generates several candidates in parallel. In the world of large language models, this is often linked to how the system evaluates the probability of a sequence of words.
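To illustrate the parallel-sampling idea: a backend may draw several candidate continuations, score each by the log-probability of its token sequence, and return only the best-scoring one. If the selection step is skipped or fails, every candidate leaks into the output, which matches the reported behavior. A toy sketch with a stand-in "model"; the phrasings and scores are invented for illustration:

```python
import math
import random


def sample_candidates(rng: random.Random, n: int = 4):
    """Stand-in for a model: each candidate is (text, per-token log-probs)."""
    phrasings = [
        "Water boosts energy and aids digestion.",
        "Staying hydrated improves energy, skin, and digestion.",
        "Drinking water supports energy, skin health, and digestion.",
        "Hydration is vital: energy, skin, digestion all benefit.",
    ]
    out = []
    for text in phrasings[:n]:
        # Fake per-token probabilities, one per whitespace-separated token.
        logps = [math.log(rng.uniform(0.2, 0.9)) for _ in text.split()]
        out.append((text, logps))
    return out


def pick_best(candidates):
    """Sequence score = sum of token log-probabilities; higher is better."""
    return max(candidates, key=lambda c: sum(c[1]))[0]


rng = random.Random(0)
cands = sample_candidates(rng)
best = pick_best(cands)
# Correct behavior: return only `best`.
# The reported bug resembles the failure mode where all candidates are
# concatenated into one response instead:
leaked = "\n\n".join(text for text, _ in cands)
```

The `leaked` string, several near-identical answers joined back to back, is structurally the same as the bugged response in the example above, which is why a backend candidate leak is a plausible explanation.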