gpt-5.2 does not follow the specified response format and emits unexpected tokens. These tokens then derail our gpt-5.2-based evaluator.

where: Completions API (hosted in the EU)
time: January 23rd
model: "gpt-5.2",
temperature: 0,
stream: true,
service_tier: "priority",
reasoning_effort: "none",
verbosity: "low"
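For reproduction, the parameters above can be sketched as a plain request payload (a hypothetical `build_request` helper; in practice the resulting dict is passed to the Chat Completions endpoint, e.g. via `client.chat.completions.create(**payload)` with the OpenAI Python SDK):

```python
# Hypothetical reproduction sketch: assembles the exact parameter set
# that triggers the behavior. The message contents are placeholders
# for our full MediVoice prompts.

def build_request(system_message: str, user_message: str) -> dict:
    """Build the Chat Completions payload used in our setup."""
    return {
        "model": "gpt-5.2",
        "temperature": 0,
        "stream": True,
        "service_tier": "priority",
        "reasoning_effort": "none",
        "verbosity": "low",
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ],
    }
```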
system_message:
"""
…You will interact … with the MediVoice API.
\n\n## MediVoice API\n``` * function msg(message: string): void: Talk to the caller using the TTS system.\n * function set(facts: Record<string, any>): any: Set facts…
"""

Unexpected behavior: The model responds with "set({"phone":"015144389161"}) 乐盈", which does not follow the specified MediVoice API (a conforming response would be msg("乐盈") instead), switches the language, and gives a semantically nonsensical answer.
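Because a conforming MediVoice turn consists only of msg(…) / set(…) calls, malformed turns like the one above can be caught mechanically. A minimal sketch (a hypothetical `is_valid_medivoice_turn` guard, not our production parser):

```python
import re

# A conforming turn is one or more lines, each a single msg(...) or
# set(...) call. Trailing free text (like the stray "乐盈" above)
# makes the line fail the match.
CALL = re.compile(r'^(?:msg|set)\(.*\)$')

def is_valid_medivoice_turn(turn: str) -> bool:
    """Return True if every non-empty line is a msg()/set() call."""
    lines = [ln.strip() for ln in turn.strip().splitlines() if ln.strip()]
    return bool(lines) and all(CALL.fullmatch(ln) for ln in lines)

# is_valid_medivoice_turn('msg("乐盈")')                         -> True
# is_valid_medivoice_turn('set({"phone":"015144389161"}) 乐盈')  -> False
```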

Furthermore, if we use our rating prompt (also through the API, using gpt-5.2), we get the response
"""
mistake: "#outputCooruptions#",
reasoning: "After set({\"existingPatient\":true}) the assistant output contains an … \u000b\u000b\u000b\u000b\u000b\u000b\u000b\u000b\u000b\u000b\u000b\u000b\u000b\u000b\u000b\u000b\u000b\u000b\u000b\u000b…
"""
with "infinitely" many "\u000b" characters.
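As a stopgap, the evaluator output could be pre-checked for this kind of corruption before the rating is trusted. A sketch (a hypothetical `looks_corrupted` check; the run-length threshold of 10 is an arbitrary choice):

```python
import re

# Flag long runs of the same control character, e.g. the repeated
# "\u000b" (vertical tab) we observe in the evaluator's reasoning.
# Covers C0 control characters except \t, \n, \r.
RUN = re.compile(r'([\x00-\x08\x0b\x0c\x0e-\x1f])\1{9,}')

def looks_corrupted(text: str) -> bool:
    """Return True if text contains a run of 10+ identical control chars."""
    return RUN.search(text) is not None
```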

This behavior never occurs when we use gpt-4.1.