Structured Outputs Deep-dive

Great @jim! Let me know how the results look with this two-phase approach!

Thanks for the write-up @platypus. Can you provide a source for the 4096 output token limit? According to the release blog post, structured outputs with response formats are only available on gpt-4o-2024-08-06 and gpt-4o-mini, both of which support 16,384 output tokens.

Great catch @peter.edmonds! I will make a correction. My intention was to make sure users understand that there is no “safety” mechanism: if you go over the max output token limit (whatever that is, in this case 16,384), you will get incomplete JSON back. Thanks again!

Done @peter.edmonds! I credited you there, thanks again!

Thanks for the update @platypus. Regarding the lack of safety mechanism, I’m not sure if that is the case. I’ve always been returned an error rather than malformed JSON when my response exceeds the context limit. This could be coming from the Zod + NodeJS SDK I’ve been using though, curious if you have seen this in the wild.

@peter.edmonds in my case I have received incomplete JSON responses - I used Python + Batch API.

@peter.edmonds when I say “safety mechanism”, I mean from the API side. Yes if I load those into a dict in Python I will get an error (because the JSON is incomplete). Similarly if I use Pydantic. But this is from the SDK, not the API.
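For anyone who wants to guard against this on the client side, here is a minimal sketch. It relies on the fact that the Chat Completions API reports a finish_reason of "length" when generation stops at the token limit; the helper name parse_structured_output and the simulated values are just illustrations, and in real code you would pass in the message content and finish_reason from your own response (or from each line of your Batch API output).

```python
import json


def parse_structured_output(content: str, finish_reason: str) -> dict:
    """Parse a structured-output message, refusing truncated responses.

    Hypothetical helper: `content` and `finish_reason` are expected to come
    from a chat completion choice in your own code.
    """
    # "length" means generation stopped at the output token limit,
    # so the JSON is very likely cut off mid-document.
    if finish_reason == "length":
        raise ValueError("response truncated at the output token limit; JSON is incomplete")
    try:
        return json.loads(content)
    except json.JSONDecodeError as exc:
        # Anything else that fails to parse is reported the same way.
        raise ValueError(f"response is not valid JSON: {exc}") from exc


# Simulated truncated response (not real API output).
truncated = '{"title": "Report", "keywords": ["a", "b"'
try:
    parse_structured_output(truncated, "length")
except ValueError as err:
    print(err)
```

Checking finish_reason first gives a clearer error than waiting for the JSON parser to fail, and it also catches the rare case where a truncated string happens to parse.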

Please feel free to redirect me to the appropriate forum: has anyone experienced a regression in the quality of answers because of Structured Outputs? Answers that were once elaborate and detailed are now short and succinct, which may not be desirable in every use case.

Yes, absolutely. Quality of SO has declined in the last week including weaker, shorter responses and hallucinated ENUM’d values. I see this across all models, with an uptick on gpt-4o-mini.

Wow, I am glad I am not hallucinating. We are on gpt-4o and didn’t want to switch to mini prematurely. Structured Outputs is a great feature; however, I wish there were a way to retain the original chatCompletions responses and use Structured Outputs only sparingly, for the parts that need to fit into a schema for post-processing in function/tool calling.

(Thank you so much @platypus for this great and timely article!)

I have a question I hope I can ask here that might be relevant to this thread:

My task involves summarization, and I need the output to conform to an existing PDF file. I used playground to read the PDF and output the schema… but I want to implement a data-driven way to provide the output schema to the model at run-time so I can change it on the fly.

I thought first to specify the schema as JSON for file storage, then load it into pydantic. But would it be smarter to ask gpt to provide python/pydantic, then use pydantic to serialize to JSON for file storage? I thought that might eliminate any schema-driven de-serialization problems.

In short, are there any best practices for getting a model to output a structured output schema that you plan to load at runtime from a file?

Glad you liked the article @oldtimehacker and welcome to the community!

So if I understood correctly, you want to have a dynamic representation of a document, and your question is regarding the best practices for schema storage?

Regarding schema storage, it may actually be best to define the JSON schema in a text file, e.g. document_summary_schema.json, since you then have it neatly in one place and can version-control and update it as needed. Then in your code you would just have schema validation code (e.g. you can load it using Pydantic and perform validation steps).
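As a rough sketch of that file-based approach, using only the standard library: the schema is read from a JSON file at run time and wrapped in the json_schema response format that Structured Outputs expects. The function names, the tiny example schema, and the required-key check are all illustrative assumptions; a real project would likely use a full validator instead of this minimal check.

```python
import json
import tempfile
from pathlib import Path


def load_response_format(schema_path: Path, name: str) -> dict:
    """Read a JSON Schema file and wrap it as a response_format payload."""
    schema = json.loads(schema_path.read_text(encoding="utf-8"))
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


def missing_required_keys(payload: dict, schema: dict) -> list:
    """Minimal check: which top-level required keys are absent from payload."""
    return [key for key in schema.get("required", []) if key not in payload]


# Illustrative schema, written to a temporary file to stand in for
# a version-controlled document_summary_schema.json.
example_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "executive_summary": {"type": "string"},
    },
    "required": ["title", "executive_summary"],
    "additionalProperties": False,
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "document_summary_schema.json"
    path.write_text(json.dumps(example_schema, indent=2), encoding="utf-8")
    response_format = load_response_format(path, "document_summary")

print(response_format["json_schema"]["name"])
print(missing_required_keys({"title": "X"}, example_schema))
```

Because the schema lives in a file, swapping it on the fly only means pointing load_response_format at a different path.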

Regarding dynamic schema generation: it leaves you a bit exposed, because you never know what you may get, and you will end up writing lots of validation code. Maybe that’s fine. But another approach is to have optional fields. Presumably a given document summary schema has some base information: title, authors, date_published, keywords, executive_summary, etc. Then you create one optional field, which is an array of objects, where each object has something along the lines of section_heading and section_summary. So if some PDF has an “Appendix A Blah blah blah”, that ends up being populated as one of these optional entries.
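Here is one way that shape could look as a JSON Schema, written as a Python dict. The field names come from the description above; the name sections for the optional array is made up for illustration. This sketch also assumes that strict mode wants every property listed under required with additionalProperties set to false, so the "optional" field is expressed as nullable rather than omitted; please check that against the current documentation.

```python
import json

# Schema for a nested section entry (names taken from the post above).
SECTION_SCHEMA = {
    "type": "object",
    "properties": {
        "section_heading": {"type": "string"},
        "section_summary": {"type": "string"},
    },
    "required": ["section_heading", "section_summary"],
    "additionalProperties": False,
}

DOCUMENT_SUMMARY_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "authors": {"type": "array", "items": {"type": "string"}},
        "date_published": {"type": "string"},
        "keywords": {"type": "array", "items": {"type": "string"}},
        "executive_summary": {"type": "string"},
        # Hypothetical field name; nullable so the model may return null
        # when a document has no extra sections (assumption, see lead-in).
        "sections": {"type": ["array", "null"], "items": SECTION_SCHEMA},
    },
    "required": [
        "title",
        "authors",
        "date_published",
        "keywords",
        "executive_summary",
        "sections",
    ],
    "additionalProperties": False,
}


def extra_sections(summary: dict) -> list:
    """Return the optional sections, treating null or missing as empty."""
    return summary.get("sections") or []


# The dict serializes directly to the JSON you would store in a file.
print(json.dumps(DOCUMENT_SUMMARY_SCHEMA, indent=2)[:60])
```

Downstream code can then call extra_sections(summary) and loop over the result without special-casing documents that have no appendix.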

Hope that makes sense :sweat_smile: