Harmony-based GPT-5 models return malformed structured outputs (SDK ≥ 1.100.2)

Hi, the OpenAI Support bot recommended that I share my findings with this community.

I’m getting random failures with malformed JSON when calling any GPT-5 model via client.beta.chat.completions.parse with a given response_format. The errors are sporadic, but the linked gist always hits one before the demo finishes executing. It usually happens after the model has gone through a few reasoning steps on a task (and collected a few messages in the conversation), but with base gpt-5 I’ve seen it happen straight away as well.

Switching the model to gpt-4o eliminates the problem completely.
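For reference, the failing call is just the standard parse helper with a pydantic model as the response_format. A minimal sketch (the schema and prompt below are illustrative, not the ones from the gist):

from openai import OpenAI
from pydantic import BaseModel

class NextStep(BaseModel):
    # stand-in schema; the real one lives in the linked gist
    current_state: str
    email: str

client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-5",
    messages=[{"role": "user", "content": "Decide the next step and fill the schema."}],
    response_format=NextStep,
)
print(completion.choices[0].message.parsed)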

The concrete exception message is: Invalid JSON: trailing characters at line 2 column 1 [type=json_invalid, input_value='{"current_state":"Need ``t...,"email":"elon@x.com``"}}', input_type=str]

Basically, the OpenAI side concatenates multiple JSON objects with a newline, resulting in malformed JSON that fails pydantic parsing.
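The pydantic side of the failure is easy to reproduce without calling the API at all; two JSON objects joined by a newline trip exactly this json_invalid error. A minimal sketch with a stand-in schema:

from pydantic import BaseModel, ValidationError

class State(BaseModel):
    current_state: str  # stand-in field, not the real schema

# two JSON objects joined by a newline, as seen in the failing completions
doubled = '{"current_state": "ok"}\n{"current_state": "ok"}'

try:
    State.model_validate_json(doubled)
except ValidationError as exc:
    print(exc)  # Invalid JSON: trailing characters at line 2 column 1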

I think the root cause is the interplay between the new Harmony response format in the gpt-5 models and Schema-Guided Reasoning (SGR), which relies on response formats to drive reasoning along predefined paths. SGR improves the cognitive capabilities of smaller models on predefined tasks (it is used mostly by teams developing with local models), but it seems to trigger an edge case in the GPT-5 series.

Gist to reproduce the problem (it also contains example console output with a stack trace) is here: https://gist.github.com/abdullin/332b03de6b86a134eedbc2e4b8379736#file-error_output-txt-L54-L85

The issue has been independently reproduced in our community via this SGR Demo gist and its modifications.

Has anybody encountered the same issue before? How do you work around it?

Best,
Rinat


Update: I tried prepending the following to the system prompt to disable reasoning and see if that helps. I’m still hitting malformed JSON.

Active channels: final
Disabled channels: analysis, commentary


Exact same issue here, with gpt-5-2025-08-07 on Azure OpenAI.

The LLM generates a JSON object twice, on two separate lines:

{"operation":{"updates":[truncated…]}}
{"operation":{"updates":[truncated…]}}

Which causes this error:

  File "/app/.venv/lib64/python3.12/site-packages/pydantic/main.py", line 746, in model_validate_json
    return cls.__pydantic_validator__.validate_json(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for Operation
  Invalid JSON: trailing characters at line 2 column 1 [type=json_invalid, input_value='{"operation":{"updates":...t intensities; and"}]}}', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/json_invalid

Using the Python openai SDK, version 1.106.1.


@jm4875 Thanks for the report! This is the same behaviour that we get.

Is this caused by a plain SO prompt or are you using SGR/SO CoT or its equivalent to drive reasoning?

Basic structured output prompt, with some function calling.

Using chat completions API


Thanks a lot, @jm4875! I’m glad to hear that the case is reproducible by multiple parties.

Have you seen any patterns in what could be causing this problem? Or any ideas on how to fix it?
Patching the OpenAI SDK to detect and remove the duplicated JSON feels hacky.

It appears when using gpt-5 + function calling.

No idea how to fix it. I improved the prompting and I’ll see over the next few days if it’s better…

Yes, one way to fix that would be to post-process the OpenAI JSON text completion ourselves and call .model_validate_json() on the cleaned-up result.
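A rough sketch of that idea (the helper name is made up; it assumes the duplicated objects are identical, so keeping only the first line is safe):

from pydantic import BaseModel, ValidationError

def validate_dedup(raw: str, model_cls: type[BaseModel]) -> BaseModel:
    # try the completion text as-is, then fall back to the first line
    # when the model emitted two newline-separated JSON objects
    try:
        return model_cls.model_validate_json(raw)
    except ValidationError:
        first_line, _, _ = raw.partition("\n")
        return model_cls.model_validate_json(first_line)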


I just got a report that the issue still persists on Microsoft Azure OpenAI with gpt-5-mini.

It is caused by JSON duplication.

If this happens, the hack is to intercept OpenAI responses within the SDK before parsing (e.g. with an httpx interceptor) and remove the second, duplicated line.
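A sketch of that interception idea, using a custom httpx transport handed to the SDK via http_client (the transport name is made up, it only covers non-streaming chat completion responses, and it assumes the duplicated object appears in message.content):

import json
import httpx
from openai import OpenAI

class DedupTransport(httpx.HTTPTransport):
    def handle_request(self, request: httpx.Request) -> httpx.Response:
        response = super().handle_request(request)
        if "/chat/completions" not in str(request.url):
            return response
        body = response.read()
        try:
            payload = json.loads(body)
            changed = False
            for choice in payload.get("choices", []):
                content = (choice.get("message") or {}).get("content")
                if content and "\n{" in content:
                    first_line = content.split("\n", 1)[0]
                    json.loads(first_line)  # only rewrite if the first line parses on its own
                    choice["message"]["content"] = first_line
                    changed = True
        except ValueError:
            return response
        if not changed:
            return response
        response.close()  # original body is fully read; hand back a rewritten copy
        # content-length / content-encoding no longer match the rewritten body
        headers = {k: v for k, v in response.headers.items()
                   if k.lower() not in ("content-length", "content-encoding")}
        return httpx.Response(response.status_code, headers=headers,
                              content=json.dumps(payload).encode(), request=request)

client = OpenAI(http_client=httpx.Client(transport=DedupTransport()))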