Enums in structured output

I am getting a JSON Schema error when including an enum of objects. Here is a fairly minimal version:

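from typing import Literal
from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class Y(BaseModel):
    a: int
    b: int

class Z(BaseModel):
    q: Literal[Y(a=1, b=2), Y(a=3, b=3)]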

# Z.model_json_schema() returns:
#
# {'properties': {'q': {'enum': [{'a': 1, 'b': 2}, {'a': 3, 'b': 3}],
#                       'title': 'Q'}},
#  'required': ['q'],
#  'title': 'Z',
#  'type': 'object'}

text = "In this example, a is 3 and so is b"

completion = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Extract the data from the text"},
        {"role": "user", "content": "<info>{info}</info>".format(info=text)},
    ],
    response_format=Z,
)

#  BadRequestError: Error code: 400 - {'error': {'message': "Invalid schema for response_format 'Z'. Please ensure it is a valid JSON Schema.", 'type': 'invalid_request_error', 'param': 'response_format', 'code': None}}

Is this a known limitation (or am I making some mistake)?

I really can’t even comprehend what you are trying to do here. The text message certainly doesn’t clue us in.

Your schema makes “q” an enum of two specific objects, so a valid reply would look like:

{
  "q": {
    "a": 1,
    "b": 2
  }
}

But that means you want b to depend on a: only the pairs (1, 2) and (3, 3) are allowed. That’s just not how the AI model and the context-free grammar constructed from the schema work.
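To locate the offending part of a schema before sending it, one option is a small pre-flight check. The sketch below is only an illustration, not part of any library: `find_object_enums` is a hypothetical helper that walks a schema dict and reports every enum that lists objects or arrays instead of primitive values. It assumes, as argued above, that the object-valued enum is what gets rejected; the 400 error itself does not say which keyword is at fault.

```python
from typing import Any

# JSON value types that are plain scalars once loaded into Python
PRIMITIVES = (str, int, float, bool, type(None))


def find_object_enums(schema: Any, path: str = "$") -> list[str]:
    """Return the path of every `enum` that lists non-primitive values."""
    hits: list[str] = []
    if isinstance(schema, dict):
        enum = schema.get("enum")
        if isinstance(enum, list) and any(
            not isinstance(value, PRIMITIVES) for value in enum
        ):
            hits.append(f"{path}.enum")
        for key, value in schema.items():
            # Enum members are data, not sub-schemas, so do not descend into them
            if key == "enum" and isinstance(value, list):
                continue
            hits.extend(find_object_enums(value, f"{path}.{key}"))
    elif isinstance(schema, list):
        for index, item in enumerate(schema):
            hits.extend(find_object_enums(item, f"{path}[{index}]"))
    return hits


# The schema printed by Z.model_json_schema() in the question
question_schema = {
    "properties": {
        "q": {"enum": [{"a": 1, "b": 2}, {"a": 3, "b": 3}], "title": "Q"}
    },
    "required": ["q"],
    "title": "Z",
    "type": "object",
}

print(find_object_enums(question_schema))
# ['$.properties.q.enum']
```

An empty list means no enum in the schema contains objects or arrays; it does not guarantee the schema is otherwise accepted.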

Basically, the only way you could do this is with two sub-schemas the AI can choose between via anyOf: a “q” schema where a and b are each pinned to a single enum value, and a second “r” schema pinned to the alternate values. Then you’d rely on the AI to pick the right schema before it even produces the numbers.

Here’s a working schema that gives you either a “q” or an “r” response (the other is null). Each has its values constrained to just one set.

import openai
client = openai.Client()

from pydantic import BaseModel, ConfigDict
from typing import Literal, Union
import json

# First schema: q with a=1, b=2
class Q(BaseModel):
    a: Literal[1]
    b: Literal[2]
    model_config = ConfigDict(extra='forbid')

# Second schema: r with a=3, b=3
class R(BaseModel):
    a: Literal[3]
    b: Literal[3]
    model_config = ConfigDict(extra='forbid')

# Top-level schema using Union (anyOf)
class Z(BaseModel):
    q: Union[Q, None] = None
    r: Union[R, None] = None

    model_config = ConfigDict(extra='forbid')

print(json.dumps(Z.model_json_schema(), indent=3))

text = "In this example, a is 3 and so is b"

completion = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Extract the data from the text"},
        {"role": "user", "content": text},
    ],
    response_format=Z,
)
completion.choices[0].message.parsed

Z(q=None, r=R(a=3, b=3))

It still makes no sense to me or to an AI, but it runs.

Takeaway: enum is for values of primitive types such as “string” or “number”, not for whole objects.
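If the goal is just “pick one of a fixed set of objects”, one way to stay within a string enum is to have the model return a label and map that label back to the object locally. This is a minimal sketch under assumptions: the labels `a1_b2` and `a3_b3`, the `PAIRS` table, and the `decode` helper are all made up for illustration, the schema is written by hand rather than generated, and the reply is a hard-coded stand-in, so no API call is made.

```python
import json

# Hypothetical labels; each one stands for one allowed (a, b) pair
PAIRS = {
    "a1_b2": {"a": 1, "b": 2},
    "a3_b3": {"a": 3, "b": 3},
}

# Hand-written JSON Schema: "q" is a plain string enum of the labels
schema = {
    "type": "object",
    "properties": {"q": {"type": "string", "enum": list(PAIRS)}},
    "required": ["q"],
    "additionalProperties": False,
}


def decode(response_text: str) -> dict:
    """Map the label in a JSON reply back to the object it stands for."""
    label = json.loads(response_text)["q"]
    return PAIRS[label]


# Stand-in for a model reply
print(decode('{"q": "a3_b3"}'))
# {'a': 3, 'b': 3}
```

With Pydantic, the equivalent field would be declared as `q: Literal["a1_b2", "a3_b3"]`, which produces a string enum in the generated schema.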