Invalid schema for response_format: schema must have a 'type' key

nikhil.gajghate · March 19, 2025, 4:26pm

Hi, I’m currently using the chat completions API to extract a text field from an image.

This is what my Pydantic models look like:

class LineItem(BaseModel):
    description: str
    quantity: float
    price: float
    is_already_incurred: bool

class MiscellaneousLineCategory(Enum):
    CRANE: Enum
    PERMIT: Enum
    OTHER: Enum
    MATERIALS: Enum

class MiscellaneousLine(LineItem):
    category: MiscellaneousLineCategory

class GeneratedBidFromPdf(BaseModel):
    title: str
    notes: str
    miscellaneous_lines: list[MiscellaneousLine]
    llm_chain_of_thought: str

This is how I’m calling OpenAI:

def construct_request(encoded_images):
    """Constructs the content for the OpenAI API."""
    return {
        "model": "gpt-4o",
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": USER_PROMPT},
            *[
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": f"data:image/png;base64,{image}",
                            },
                        },
                    ],
                }
                for image in encoded_images
            ],
        ],
        "temperature": 0,
        "seed": 42,
        "response_format": GeneratedBidFromPdf,
    }


def hit_openai(request):
    """Hits the OpenAI API with the encoded images."""

    response = client.beta.chat.completions.parse(  
        model=request["model"],
        messages=request["messages"],
        temperature=request["temperature"],
        seed=request["seed"],
        response_format=request["response_format"]
    )
    response_json = response.choices[0].message.parsed
    return response_json

I get the following error:

Error code: 400 - {'error': {'message': "Invalid schema for response_format 'GeneratedBidFromPdf': In context=('properties', 'category'), schema must have a 'type' key.", 'type': 'invalid_request_error', 'param': 'response_format', 'code': None}}

If I’m understanding this correctly, is OpenAI yelling at me because MiscellaneousLineCategory is not one of the accepted types?

_j · March 19, 2025, 6:25pm

The API error doesn’t give much information, and the SDK probably can’t do anything with an enum whose members have no values…
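A quick way to check this, as a minimal standard-library sketch: a bare annotation such as CRANE: Enum assigns nothing, so the class ends up with no members at all. The class names EmptyCategory and FilledCategory are made up for illustration and are not from the original post.

```python
from enum import Enum


class EmptyCategory(Enum):
    # Annotation only, no assignment: these lines do NOT create members.
    CRANE: Enum
    PERMIT: Enum


class FilledCategory(Enum):
    # Assigning a value is what actually creates a member.
    CRANE = "crane"
    PERMIT = "permit"


empty_members = list(EmptyCategory)
filled_values = [member.value for member in FilledCategory]

print(empty_members)   # no members to iterate over
print(filled_values)   # the assigned values, in definition order
```

With no members, there is nothing from which a schema generator could infer a type for the field.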

Understanding Your Schema’s Motivation

Your schema appears designed to clearly structure an AI-generated bid, including line items, categories, and internal reasoning. The goal is likely to ensure the AI outputs structured, predictable data that your application can reliably parse and use.

Issue: Misuse of Enum (Use Literal Instead)

Your original schema incorrectly defined an Enum without actual values:

class MiscellaneousLineCategory(Enum):
    CRANE: Enum
    PERMIT: Enum
    OTHER: Enum
    MATERIALS: Enum

This won’t work: a bare annotation such as CRANE: Enum assigns no value, so no member is created and the enum ends up empty. For your use case, constraining the AI to a fixed set of strings, it’s simpler and clearer to use Literal:

Corrected Example:

from typing import Literal

class MiscellaneousLine(LineItem):
    category: Literal["crane", "permit", "other", "materials"]

This directly translates into a JSON Schema enum, clearly constraining the AI’s output.
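To see that translation for yourself, here is a small sketch that prints the schema Pydantic generates for the field. It assumes Pydantic v2 (model_json_schema is the v2 method name) and, to stay self-contained, inherits from BaseModel rather than from LineItem.

```python
from typing import Literal

from pydantic import BaseModel


class MiscellaneousLine(BaseModel):
    # Simplified for the sketch: the version above inherits from LineItem.
    category: Literal["crane", "permit", "other", "materials"]


# Assumes Pydantic v2; in v1 the equivalent call was .schema().
schema = MiscellaneousLine.model_json_schema()
category_schema = schema["properties"]["category"]

print(category_schema)
```

The printed dictionary should contain an "enum" key listing the four strings; the other keys it carries may vary with the Pydantic version.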

Recommendation: Add Descriptions to Guide the AI

Adding descriptions helps the AI understand exactly what each field means, improving the accuracy and relevance of its structured responses.

Example with Descriptions:

from pydantic import BaseModel, Field
from typing import Literal, List

class LineItem(BaseModel):
    description: str = Field(..., description="Description of the line item.")
    quantity: float = Field(..., description="Quantity of the item.")
    price: float = Field(..., description="Price per unit of the item.")
    is_already_incurred: bool = Field(..., description="Whether the cost has already been incurred.")

class MiscellaneousLine(LineItem):
    category: Literal["crane", "permit", "other", "materials"] = Field(
        ..., description="Category of the miscellaneous line item."
    )

class GeneratedBidFromPdf(BaseModel):
    title: str = Field(..., description="Title of the generated bid.")
    notes: str = Field(..., description="Additional notes or comments.")
    miscellaneous_lines: List[MiscellaneousLine] = Field(
        ..., description="List of miscellaneous line items."
    )
    llm_chain_of_thought: str = Field(
        ..., description="Internal reasoning or chain-of-thought from the AI."
    )

Strictness is Automatic with Pydantic and the SDK

When you pass a Pydantic BaseModel to the OpenAI SDK as a response_format, the SDK automatically generates a strict JSON Schema:

All fields defined in your model are required by default.
No extra fields are allowed by default (additionalProperties is set to false), with no extra configuration needed.

Thus, your schema is already strict and well-defined once you use a properly structured Pydantic model. There’s no choice to be made about which fields to fill in: the AI has to produce all of them.
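As a rough illustration of those two rules, the sketch below checks a hand-written object schema for them. The helper is_strict_object and both example schemas are hypothetical; this is not the SDK's own conversion code, and it only looks at the top level of the schema.

```python
def is_strict_object(schema: dict) -> bool:
    """Hypothetical helper: True if every property is listed as required
    and extra keys are explicitly forbidden (top level only)."""
    properties = schema.get("properties", {})
    all_required = set(schema.get("required", [])) == set(properties)
    no_extras = schema.get("additionalProperties") is False
    return all_required and no_extras


# Hand-written examples of the shape described above (not SDK output).
strict_schema = {
    "type": "object",
    "properties": {"title": {"type": "string"}, "notes": {"type": "string"}},
    "required": ["title", "notes"],
    "additionalProperties": False,
}
loose_schema = {
    "type": "object",
    "properties": {"title": {"type": "string"}, "notes": {"type": "string"}},
    "required": ["title"],  # "notes" is optional, and extra keys are allowed
}

print(is_strict_object(strict_schema))
print(is_strict_object(loose_schema))
```

A check like this can be handy if you ever write the JSON Schema by hand instead of passing a Pydantic model.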

With a corrected model, the call looks like this, provided you use the parse method, a supported model, and a recent version of the SDK:

completion = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "An AI bidder"},
        {"role": "user", "content": text},
    ],
    response_format=GeneratedBidFromPdf,
)

nikhil.gajghate · March 19, 2025, 7:44pm

I was able to get it working by doing the following:

class MiscellaneousLineCategory(Enum):
    CRANE = 1
    PERMIT = 2
    OTHER = 3
    MATERIALS = 4

Thanks boss!

_j · March 19, 2025, 7:57pm

You might have got it running, but it might not really be “working”.

An enum of bare numbers needs even more description before the AI can emit the right one, because the numbers themselves say nothing about what they mean.

This is the schema Pydantic generates for it:

   "$defs": {
      "MiscellaneousLineCategory": {
         "enum": [1, 2, 3, 4],
         "title": "MiscellaneousLineCategory",
         "type": "integer"
      }
   }

The member names such as CRANE and PERMIT do not appear in it, so there’s nothing else for the SDK to pass along about what each number means.
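If you would rather keep an Enum than switch to Literal, one option is to give the members string values, so that the values the AI sees are self-describing. A minimal sketch, assuming Pydantic v2 and again using BaseModel directly instead of LineItem; it only inspects the generated schema and has not been checked against the API here.

```python
from enum import Enum

from pydantic import BaseModel


class MiscellaneousLineCategory(str, Enum):
    # String values end up in the schema, unlike the member names.
    CRANE = "crane"
    PERMIT = "permit"
    OTHER = "other"
    MATERIALS = "materials"


class MiscellaneousLine(BaseModel):
    # Simplified for the sketch: the thread's version inherits from LineItem.
    category: MiscellaneousLineCategory


# Assumes Pydantic v2; the enum is emitted under "$defs" and referenced.
schema = MiscellaneousLine.model_json_schema()
category_def = schema["$defs"]["MiscellaneousLineCategory"]

print(category_def)
```

The definition should now list the four strings under "enum" with "type": "string", which gives the AI the same information as the Literal version.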
