Invalid schema for response_format: schema must have a 'type' key

Hi, I’m currently using the chat completions API to extract a text field from an image.

This is what my Pydantic models look like:

class LineItem(BaseModel):
    description: str
    quantity: float
    price: float
    is_already_incurred: bool

class MiscellaneousLineCategory(Enum):
    CRANE: Enum
    PERMIT: Enum
    OTHER: Enum
    MATERIALS: Enum

class MiscellaneousLine(LineItem):
    category: MiscellaneousLineCategory

class GeneratedBidFromPdf(BaseModel):
    title: str
    notes: str
    miscellaneous_lines: list[MiscellaneousLine]
    llm_chain_of_thought: str

This is how I’m calling OpenAI:

def construct_request(encoded_images):
    """Constructs the content for the OpenAI API."""
    return {
        "model": "gpt-4o",
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": USER_PROMPT},
            *[
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": f"data:image/png;base64,{image}",
                            },
                        },
                    ],
                }
                for image in encoded_images
            ],
        ],
        "temperature": 0,
        "seed": 42,
        "response_format": GeneratedBidFromPdf,
    }


def hit_openai(request):
    """Hits the OpenAI API with the encoded images."""

    response = client.beta.chat.completions.parse(  
        model=request["model"],
        messages=request["messages"],
        temperature=request["temperature"],
        seed=request["seed"],
        response_format=request["response_format"]
    )
    response_json = response.choices[0].message.parsed
    return response_json

I get the following error:

Error code: 400 - {'error': {'message': "Invalid schema for response_format 'GeneratedBidFromPdf': In context=('properties', 'category'), schema must have a 'type' key.", 'type': 'invalid_request_error', 'param': 'response_format', 'code': None}}

If I’m understanding this correctly, is OpenAI yelling at me because MiscellaneousLineCategory is not one of the accepted types?

The API error doesn’t give enough information, and the SDK probably can’t do anything with an enum whose members have no values…

Understanding Your Schema’s Motivation

Your schema appears designed to clearly structure an AI-generated bid, including line items, categories, and internal reasoning. The goal is likely to ensure the AI outputs structured, predictable data that your application can reliably parse and use.


Issue: Misuse of Enum (Use Literal Instead)

Your original schema incorrectly defined an Enum without actual values:

class MiscellaneousLineCategory(Enum):
    CRANE: Enum
    PERMIT: Enum
    OTHER: Enum
    MATERIALS: Enum

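This won’t work because a bare annotation such as CRANE: Enum does not create a member. Members exist only when a value is assigned, so this Enum is empty, which is likely why the generated schema for category has no 'type' key. However, for your use case (constraining the AI to a fixed set of strings), it’s simpler and clearer to use Literal: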

Corrected Example:

from typing import Literal

class MiscellaneousLine(LineItem):
    category: Literal["crane", "permit", "other", "materials"]

This directly translates into a JSON Schema enum, clearly constraining the AI’s output.
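
A quick way to check both points without calling the API, sketched with the standard library only. The names EmptyCategory and Category are made up for this illustration: the first mirrors the annotation-only Enum from the question, the second is an alias for the Literal used above.

```python
from enum import Enum
from typing import Literal, get_args


class EmptyCategory(Enum):
    # Annotations alone do not create members; nothing is assigned here.
    CRANE: Enum
    PERMIT: Enum
    OTHER: Enum
    MATERIALS: Enum


# Hypothetical alias for the Literal used in MiscellaneousLine.category.
Category = Literal["crane", "permit", "other", "materials"]

enum_members = list(EmptyCategory)   # empty: the Enum has no members at all
literal_values = get_args(Category)  # the four allowed strings, in order

print(enum_members)
print(literal_values)
```

The empty member list is the reason there is nothing for a schema generator to describe, while the Literal carries its allowed strings with it.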


Recommendation: Add Descriptions to Guide the AI

Adding descriptions helps the AI understand exactly what each field means, improving the accuracy and relevance of its structured responses.

Example with Descriptions:

from pydantic import BaseModel, Field
from typing import Literal, List

class LineItem(BaseModel):
    description: str = Field(..., description="Description of the line item.")
    quantity: float = Field(..., description="Quantity of the item.")
    price: float = Field(..., description="Price per unit of the item.")
    is_already_incurred: bool = Field(..., description="Whether the cost has already been incurred.")

class MiscellaneousLine(LineItem):
    category: Literal["crane", "permit", "other", "materials"] = Field(
        ..., description="Category of the miscellaneous line item."
    )

class GeneratedBidFromPdf(BaseModel):
    title: str = Field(..., description="Title of the generated bid.")
    notes: str = Field(..., description="Additional notes or comments.")
    miscellaneous_lines: List[MiscellaneousLine] = Field(
        ..., description="List of miscellaneous line items."
    )
    llm_chain_of_thought: str = Field(
        ..., description="Internal reasoning or chain-of-thought from the AI."
    )

Strictness is Automatic with Pydantic and the SDK

When you pass a Pydantic BaseModel to the OpenAI SDK as a response_format, the SDK automatically generates a strict JSON Schema:

  • All fields defined in your model are required by default.
  • No extra fields are allowed by default (additionalProperties is set to false), with no extra configuration needed.

Thus, your schema is already strict and well-defined once you use a properly structured Pydantic model. There’s no choice to be made about which fields are useful; they’ll all be filled out.
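
To make the two bullet points concrete, here is a rough sketch of what "strict" means for an object schema. The schema dict is hand-written for illustration and is not the SDK's exact output, and is_strict_object is a hypothetical helper, not part of the OpenAI SDK or Pydantic.

```python
def is_strict_object(schema: dict) -> bool:
    """Return True if an object schema requires every property and forbids extras."""
    properties = schema.get("properties", {})
    all_required = set(schema.get("required", [])) == set(properties)
    no_extras = schema.get("additionalProperties") is False
    return all_required and no_extras


# Hand-written illustration of a strict schema for LineItem (assumed shape).
line_item_schema = {
    "type": "object",
    "properties": {
        "description": {"type": "string"},
        "quantity": {"type": "number"},
        "price": {"type": "number"},
        "is_already_incurred": {"type": "boolean"},
    },
    "required": ["description", "quantity", "price", "is_already_incurred"],
    "additionalProperties": False,
}

# The same schema with only one field required is no longer strict.
loose_schema = {**line_item_schema, "required": ["description"]}

print(is_strict_object(line_item_schema))
print(is_strict_object(loose_schema))
```

The check passes for the first schema and fails for the second, which is the difference between a strict schema and an ordinary one with optional fields.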

With the correct method, a supported model, and a recent version of the SDK library, this schema will get the AI writing for you:

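from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# text is a placeholder for the user content you want analyzed
completion = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "An AI bidder"},
        {"role": "user", "content": text},
    ],
    response_format=GeneratedBidFromPdf,
)
# parsed holds the response validated against GeneratedBidFromPdf
bid = completion.choices[0].message.parsed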

I was able to get it working by doing the following:

class MiscellaneousLineCategory(Enum):
    CRANE = 1
    PERMIT = 2
    OTHER = 3
    MATERIALS = 4

Thanks boss!

You might have got it working - but it might not be “working”

An Enum of bare numbers will need even more description before the AI can emit the right one for each category.

This is the schema Pydantic generates from it:

   "$defs": {
      "MiscellaneousLineCategory": {
         "enum": [
            1,
            2,
            3,
            4
         ],
         "title": "MiscellaneousLineCategory",
         "type": "integer"
      }

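The schema carries only the integers; the member names (CRANE, PERMIT, OTHER, MATERIALS) are not in it, so there’s nothing else for the SDK to transcribe about their usage.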
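
If you would rather keep an Enum than switch to Literal, one option is to give each member a string value, so the values that end up in the schema's enum list describe themselves. A minimal sketch, assuming Pydantic serializes an Enum by its member values (as the integer schema above suggests):

```python
from enum import Enum


class MiscellaneousLineCategory(str, Enum):
    # String values are what would appear in the schema's "enum" list.
    CRANE = "crane"
    PERMIT = "permit"
    OTHER = "other"
    MATERIALS = "materials"


allowed_values = [member.value for member in MiscellaneousLineCategory]
print(allowed_values)

# A string returned by the model maps back to a member by value.
parsed = MiscellaneousLineCategory("permit")
print(parsed is MiscellaneousLineCategory.PERMIT)
```

With string values the model sees "crane" or "permit" rather than 1 or 2, so far less explanation is needed in the field description.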