Structured Outputs Deep-dive

The article’s reference to CFG (context-free grammar), along with the flowchart illustration that opens it, is highly specific yet highly speculative. OpenAI doesn’t disclose how its JSON mode actually works.

In fact, the JSON response appears to be backwards-looking, i.e. token-run based. Generate and then extend a json-mode response one token at a time by increasing the max_tokens value, and you’ll see earlier logits replaced with new ones. This may be a switch-transformer or mixture-of-experts architecture at work, though that is only a suspicion.
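The probe described above boils down to comparing two token runs from the same prompt at different max_tokens values. A minimal sketch of that comparison (the token lists here are fabricated stand-ins for the logprob output of two real API calls):

```python
def first_divergence(run_a, run_b):
    """Index of the first position where two token runs disagree,
    or None if the shorter run is an exact prefix of the longer one."""
    for i, (a, b) in enumerate(zip(run_a, run_b)):
        if a != b:
            return i
    return None

# Fabricated example: if raising max_tokens merely appended tokens,
# the shorter run would be a strict prefix of the longer one.
# A non-None divergence index means earlier tokens were replaced.
short_run = ['{"', 'name', '":', ' "', 'Al']
long_run  = ['{"', 'name', '":', ' "', 'Alice', '"}']

print(first_divergence(short_run, long_run))  # → 4
```

Run this against real logprob output from two calls and a None result would argue for pure left-to-right extension; a divergence argues for regeneration.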

If the algorithm were clever, it wouldn’t consider 500 tab characters or newlines a valid production of this mode. That is why, originally, “JSON” had to be mentioned in the prompt or the request would be rejected, and why the output is now described to the AI by a schema — which was good practice even before this feature existed.
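The whitespace point is easy to verify with the standard json module: a run of 500 tabs followed by an object is still perfectly valid JSON, so a sampler constrained only by grammar-level validity has no reason to reject it:

```python
import json

# 500 tab characters, then a minimal object: this parses fine,
# so a purely grammar-constrained decoder would accept the tabs
# as a legal production at every step.
payload = "\t" * 500 + '{"ok": true}'
print(json.loads(payload))  # → {'ok': True}
```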

Also, Pydantic is merely one possible client-side implementation; it has nothing to do with what goes over the wire.
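To make that concrete, here is a sketch of what actually travels in the request body: a plain JSON Schema object, which you could just as well write by hand (the field names below are made up; Pydantic’s role is only to generate such a dict for you):

```python
import json

# Hand-written JSON Schema -- no Pydantic objects ever leave the
# client process; only this serialized dict crosses the wire.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

wire_payload = json.dumps({
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "person", "schema": schema},
    }
})
print(wire_payload)
```

Round-tripping `wire_payload` through `json.loads` recovers the same dict, underlining that the server only ever sees serialized JSON.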
