Structured outputs is the way to go if you have complex records and rigid requirements for outputs. The tech under the hood for it is not just a fine tune… OpenAI actually ensures the production of each token follows the pattern you define inside the transformer itself, so your output structure is guided during the generation process, not as an output filter.
Another option is to make your output record definition look like a tool call… the model thinks its calling a tool but you use the tool parameter list as a structured output instead. Gets you some of the benefits and arguably more flexibility since you can still get other modalities of output tokens.
If you want to stick to the approach you have now, just ask the model to generate json, here are some of my best practices:
- use the chat API and treat each request like a multi-turn conversation with built-in checks that iterates until the json is valid. Put your record format in the system/developer prompt, then your initial request in the first user prompt. If the response is valid then use it but if not, add a second user turn to the conversation with the specifics of what’s wrong (which you can generate using code). This way, the model sees what you want, what it did last turn, what was wrong with that, and provides a fix. Rinse and repeat.
- ask it to put clear delinetors around the json (tripple back-tick or angle brackets… just something you can use to parse the json out of all the other tokens the model might stream at you as text) and reject the output (by adding an automated user turn to the converstation) that tells the model it forgot the delineators
- parse the json and if its not valid, add a automated user turn with whatever the json parser says is wrong. Lots of times there’s a python-formatted “None” where json wants a null, or a missing escaped quote or whatever. just use the error the json parser gives you.
- validate the parsed json record and bounce it back to the model with an automated ‘that looks like good json but field X can only have one of three values A, B, C etc.
I only use regex to parse streaming json that lets me tell the user what’s going on before the model is done. For long records, I don’t want to wait without giving the user feedback so I need to be able to recognize what the model is streaming and tell the user something like “One moment, field 5 ot 12 is complete…”
that last approach ends up looking like
system: please make an array of recods in json for any dates mentioned in the following text. each record should have the fields “year” and “month” as strings. mark the start of your json with JSON: and end with a new line. today is October 2, 2025.
user: This time last year, the leaves were more colorful. Now its dry like 5 years back
assistant: hmmm… let me think about that JSON:[{\“year\”:2024,\”month\”:\”october\”},{\“year”:2020,\”month\”:\”october\”}]
user: unexpected end of file at position 79. did you forget to close a quiote?
assistant: oh! yeah one sec:[{\“year\”:2024,\”month\”:\”october\”},{\“year\”:2020,\”month\”:\”october\”}]
user: No JSON start token was found. Include JSON: and terminate with a new line
assistant: JSON:[{\“year\”:2024,\”month\”:\”october\”},{\”month\”:\”october\”}]
user: invalid record 0 “year” should be a string. Invalid record 1 “year” missing
assistant: gosh you are so right thanks for patience here we go JSON:[{\“year\”:2024,\”month\”:\”october\”},{\“year\”:2020,\”month\”:\”october\”}]
(all tests pass)