Tips for improving GPT-5 JSON output consistency?

Hi everyone 👋
I recently started working with the GPT-5 API and noticed that sometimes the JSON responses are inconsistent (missing brackets or unexpected formatting).

I’d love to hear how other developers handle this.

  • Do you use regex cleanup, schema validation, or function calling?

  • Any best practices you recommend for production apps?

Thanks in advance for sharing your experience 🙏


It does rather seem like gpt-5 is a model where there is a sack of words and symbols, and an AI claw machine reaches in and picks one nearly at random.

You will get much better language production, and a more natural ability to generate the correct thing, from a model such as gpt-4.1. First, it does no reasoning, so it won't self-generate its own distraction; second, you can set the API parameter “top_p” to a low value such as 0.01, so the model's sampler picks token candidates only from the top 1% of the probability distribution.
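A minimal sketch of that with the openai Python SDK (the model name and prompts here are just placeholders; the low top_p narrows sampling as described above):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A low top_p restricts sampling to the highest-probability tokens,
# which tends to make structured output more deterministic.
response = client.chat.completions.create(
    model="gpt-4.1",
    top_p=0.01,
    messages=[
        {"role": "system", "content": "Reply only with valid JSON."},
        {"role": "user", "content": "List three colors as a JSON array."},
    ],
)
print(response.choices[0].message.content)
```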

If your final product is a JSON document whose structure will be programmatically enforced and must validate, you should employ a strict structured-output schema as the response format on the API. On the Responses API, this goes in the text.format parameter with "type": "json_schema".

https://platform.openai.com/docs/guides/structured-outputs

Unfortunately, that guide from OpenAI, in pushing you toward proprietary methods, obscures the fact that you are writing a draft JSON Schema, placed in a container with fields like “strict” and “name”. Every one of the schema's properties must also appear in a “required” list, and “additionalProperties”: false must be set at each nesting level. You'd have to read the cURL example to have a clue what's going on.
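A hedged sketch of what that container looks like on the Responses API (the schema and the name "date_record" are invented for illustration; the field placement follows the structured-outputs guide):

```python
from openai import OpenAI

client = OpenAI()

# Illustrative schema: every property listed in "required",
# "additionalProperties": false at each nesting level.
date_schema = {
    "type": "object",
    "properties": {
        "year": {"type": "string"},
        "month": {"type": "string"},
    },
    "required": ["year", "month"],
    "additionalProperties": False,
}

response = client.responses.create(
    model="gpt-5",
    input="What month and year is it? Today is October 2, 2025.",
    text={
        "format": {
            "type": "json_schema",
            "name": "date_record",  # container fields, not part of the schema itself
            "strict": True,
            "schema": date_schema,
        }
    },
)
print(response.output_text)  # should validate against date_schema
```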

Or, use a handy helper that will produce the API's required format, which I've got over here:

Functions are for mid-reply actions the AI can take, not for producing a final deliverable.


Structured outputs is the way to go if you have complex records and rigid requirements for outputs. The tech under the hood is not just a fine-tune: OpenAI constrains the production of each token to follow the pattern you define during generation itself, so your output structure is guided as it is produced, not cleaned up by an output filter afterwards.

Another option is to make your output record definition look like a tool call: the model thinks it's calling a tool, but you use the tool's parameter list as a structured output instead. This gets you some of the benefits and arguably more flexibility, since you can still get other modalities of output tokens.
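A minimal sketch of that trick with the Chat Completions API (the tool name emit_record and its parameters are made up for illustration; tool_choice forces the call):

```python
import json
from openai import OpenAI

client = OpenAI()

# Define the record format as a "tool"; the model fills in the arguments.
tools = [{
    "type": "function",
    "function": {
        "name": "emit_record",  # hypothetical name; never actually executed
        "parameters": {
            "type": "object",
            "properties": {
                "year": {"type": "string"},
                "month": {"type": "string"},
            },
            "required": ["year", "month"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "What month was it one year before October 2025?"}],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "emit_record"}},
)

# Instead of running a tool, read the arguments as your structured record.
record = json.loads(response.choices[0].message.tool_calls[0].function.arguments)
print(record)
```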

If you want to stick with the approach you have now and just ask the model to generate JSON, here are some of my best practices:

  • Use the chat API and treat each request like a multi-turn conversation with built-in checks that iterates until the JSON is valid (a sketch of this loop follows the list). Put your record format in the system/developer prompt, then your initial request in the first user prompt. If the response is valid, use it; if not, add a second user turn to the conversation with the specifics of what's wrong (which you can generate using code). This way, the model sees what you want, what it did last turn, and what was wrong with that, and provides a fix. Rinse and repeat.

  • Ask it to put clear delimiters around the JSON (triple backticks or angle brackets, just something you can use to parse the JSON out of all the other tokens the model might stream at you as text), and reject the output (by adding an automated user turn to the conversation) that tells the model it forgot the delimiters.

  • Parse the JSON, and if it's not valid, add an automated user turn with whatever the JSON parser says is wrong. Lots of times there's a Python-formatted “None” where JSON wants a null, or a missing escaped quote, or whatever. Just use the error the JSON parser gives you.

  • Validate the parsed JSON record and bounce it back to the model with an automated “that looks like good JSON, but field X can only have one of three values: A, B, C”, etc.
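A minimal sketch of that validate-and-retry loop, assuming the openai Python SDK and a made-up validate_record helper standing in for your app-level checks:

```python
import json
from openai import OpenAI

client = OpenAI()

def validate_record(record):
    """Hypothetical app-level check; return an error string, or None if OK."""
    if not isinstance(record.get("year"), str):
        return 'field "year" should be a string'
    return None

messages = [
    {"role": "system", "content": 'Return a JSON object with string fields "year" and "month".'},
    {"role": "user", "content": "This time last year, the leaves were more colorful."},
]

for attempt in range(5):  # cap retries so a stubborn model can't loop forever
    reply = client.chat.completions.create(model="gpt-4.1", messages=messages)
    text = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": text})
    try:
        record = json.loads(text)
    except json.JSONDecodeError as e:
        # Feed the parser's own error back as an automated user turn.
        messages.append({"role": "user", "content": f"Invalid JSON: {e}. Please fix and resend."})
        continue
    error = validate_record(record)
    if error is None:
        break  # valid JSON that also passes the app-level checks
    messages.append({"role": "user", "content": f"Good JSON, but {error}. Please fix and resend."})
```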

I only use regex to parse streaming JSON, which lets me tell the user what's going on before the model is done. For long records, I don't want the user waiting without feedback, so I need to recognize what the model is streaming and tell the user something like “One moment, field 5 of 12 is complete…”
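A rough sketch of that progress trick, assuming the record streams as flat "key": "value" pairs (the chunks list here is a stand-in for the text deltas a streaming API response would yield):

```python
import re

# Matches one completed "key": "value" pair in the streamed text so far.
FIELD_DONE = re.compile(r'"\w+"\s*:\s*"[^"]*"')

TOTAL_FIELDS = 4  # known in advance from the record format we asked for

buffer = ""
# Stand-in for streaming deltas; in production these come from the API.
chunks = ['[{"year":"2024",', '"month":"oct', 'ober"},{"year":"2020",', '"month":"october"}]']
for chunk in chunks:
    buffer += chunk
    done = len(FIELD_DONE.findall(buffer))
    print(f"One moment, field {done} of {TOTAL_FIELDS} is complete...")
```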

That last approach ends up looking like this:

system: Please make an array of records in JSON for any dates mentioned in the following text. Each record should have the fields "year" and "month" as strings. Mark the start of your JSON with JSON: and end with a new line. Today is October 2, 2025.

user: This time last year, the leaves were more colorful. Now it's dry like 5 years back.

assistant: hmmm… let me think about that JSON:[{"year":2024,"month":"october"},{"year:2020,"month":"october"}]

user: Unexpected end of file at position 79. Did you forget to close a quote?

assistant: oh! yeah, one sec:[{"year":2024,"month":"october"},{"year":2020,"month":"october"}]

user: No JSON start token was found. Include JSON: and terminate with a new line.

assistant: JSON:[{"year":2024,"month":"october"},{"month":"october"}]

user: Invalid record 0: "year" should be a string. Invalid record 1: "year" missing.

assistant: gosh, you are so right, thanks for your patience, here we go JSON:[{"year":"2024","month":"october"},{"year":"2020","month":"october"}]

(all tests pass)