Trying to use structured output as part of payload for an API request

Hello ! I need some help using structured outputs. I am trying to process the Table of Contents (TOC) of a corpus of pdf documents. I have a simple json schema called TOC Response, like so:

class TOCEntry(BaseModel):
entry: str # Define the ‘entry’ field as a string
nom_page: int # Define the ‘nom_page’ field as an integer

class TOCResponse(BaseModel):
toc_entries: List[TOCEntry]

I am sending messages to API as part of a payload (below), but when I include response_format in the payload, I get an error message saying “Object of type ModelMetaclass is not JSON serializable”.

payload = {
    "model": "gpt-4o-2024-08-06",
    "messages": messages,
    "max_tokens": 4096,
    "temperature": 0.7,
    "top_p": 1.0,
    "response_format": TOCResponse,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0
}

My question is how can I request the API to use response_format if I cannot include it in the payload, nor in the response -

response = requests.post(“https://api.openai.com/v1/chat/completions”, headers=headers, json=payload,response_format=TOCResponse)

as I get a different error -Session.request() got an unexpected keyword argument ‘response_format’

Thank you for any suggestions.

Hi! To use pydantic objects you’ll need to use the openai sdk directly, instead of requests.post()… like the python example here: https://platform.openai.com/docs/guides/structured-outputs/introduction?lang=python

1 Like

You can also do it with pydantic should you need to avoid using the openai client for some reason.

from typing import override

import pydantic


class OpenAiResponseFormatSchemaGenerator(pydantic.json_schema.GenerateJsonSchema):
    @override
    def generate(self, schema, mode):
        json_schema = super().generate(schema, mode)
        return {
            "type": "json_schema",
            "json_schema": {
                "name": json_schema.pop("title"),
                "strict": True,
                "schema": json_schema,
            },
        }


class StrictModel(pydantic.BaseModel):
    model_config = {"extra": "forbid"}


class PeopleExtractor(StrictModel):
    class Person(StrictModel):
        name: str
        age: int

    people: list[Person]


response_format_schema = PeopleExtractor.model_json_schema(
    schema_generator=OpenAiResponseFormatSchemaGenerator
)

Thank you so much ! I am afraid my code is a bit clunky as I jerry rigged from my earlier code to insert the response_format into the payload I was using, please see below, but it appears to be working in that it provides me with the json I need. It seems very expensive in terms of tokens and charges though, if there is a way to economize I would truly appreciate as I need to process hundreds of documents :

Create the payload for the API request

    payload = {
        "model": "gpt-4o-2024-08-06",
        "messages": messages,
        "max_tokens": 4096,
        "temperature": 0.2,
        "top_p": 1.0,
        "frequency_penalty": 0.0,
        "presence_penalty": 0.0,
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "schema": {
                    "$defs": {
                        "TOCEntry": {
                            "properties": {
                                "entry": {"title": "Entry", "type": "string"},
                                "nom_page": {"title": "Nom Page", "type": "integer"},
                                "level": {"title": "Level", "type": "integer"}
                            },
                            "required": ["entry", "nom_page", "level"],
                            "title": "TOCEntry",
                            "type": "object",
                            "additionalProperties": False
                        }
                    },
                    "properties": {
                        "toc_entries": {
                            "items": {"$ref": "#/$defs/TOCEntry"},
                            "title": "Toc Entries",
                            "type": "array"
                        }
                    },
                    "required": ["toc_entries"],
                    "title": "TOCResponse",
                    "type": "object",
                    "additionalProperties": False
                },
                "name": "TOCResponse",
                "strict": True
            }
        }
    }

Resolving the $refs and writing inlined schema would save you about 30% tokens on the schema itself. Other than that you should try 4o-mini