Pydantic_function_tool vs response_format

Hi everyone!

I'm not sure I understand the different use cases for pydantic_function_tool vs response_format when producing structured output.

Do you have any information about best practices?

import openai  # needed for the openai.pydantic_function_tool helper below
from openai import OpenAI
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": "Extract the names and ages of the people mentioned in the following text."
        },
        {
            "role": "user",
            "content": "John is 30 years old and his sister Alice is 25."
        }
    ],
    tools=[
        openai.pydantic_function_tool(Person)
    ]
)

print(completion.choices[0].message.tool_calls[0].function.parsed_arguments)
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()

class Step(BaseModel):
    explanation: str
    output: str

class MathReasoning(BaseModel):
    steps: list[Step]
    final_answer: str

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor. Guide the user through the solution step by step."},
        {"role": "user", "content": "how can I solve 8x + 7 = -23"}
    ],
    response_format=MathReasoning,
)

math_reasoning = completion.choices[0].message.parsed

There’s no Python SDK method for functions that takes a Pydantic BaseModel as input directly, in the same way that the response_format parameter can accept such a class when you use the parse() method.

For the convenience of writing a function as a BaseModel instead of writing out its JSON schema specification by hand, OpenAI provides a helper method that turns the Pydantic class you create into a properly formatted function definition.

I still don’t understand when (for which use case) to use one or the other.

A function is a gateway to having the AI able to make use of the external world in an interactive way, through your code. It can call a function when it is useful, such as “post_message_to_openai_forum”, or “add_numbers_together()”, or “search_knowledgebase()”.

Then your code can return a value as an addition to a chat history and run the API call again, until the AI is done calling functions and says something like “I posted your answer to the OpenAI forum successfully.”
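That dispatch step can be sketched in plain Python (the function names and the `run_tool` helper here are hypothetical, not part of the SDK):

```python
import json

# Hypothetical local implementations the model is allowed to call.
def add_numbers_together(a: float, b: float) -> float:
    return a + b

TOOL_REGISTRY = {"add_numbers_together": add_numbers_together}

def run_tool(name: str, arguments_json: str) -> str:
    """Dispatch one tool call from the model and return the result
    as a JSON string, ready to append as a "tool" role message."""
    func = TOOL_REGISTRY[name]
    args = json.loads(arguments_json)
    return json.dumps({"result": func(**args)})

# The result goes back into the chat history, roughly as:
# {"role": "tool", "tool_call_id": call.id, "content": run_tool(...)}
```

You would loop: send messages, execute any tool calls with a helper like this, append the results, and call the API again until the model stops requesting tools.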

The list of tools the AI can use is specified through the tools parameter, only as a JSON schema (a schema-like Python object) with a metadata format containing the list of named functions.

That’s where, if you tried to be fancy and use functions with Pydantic in a kind of plug-in manner, you’d need to transform the schema internal to the BaseModel into something that can be sent with the tools parameter. That’s what the helper function is for, hidden deep in the SDK code if not for the documentation.


A structured output response format, however, is a mandatory JSON that the AI must output as a final response instead of natural language to a user.

With the Python SDK library, the API parameter itself can be passed a BaseModel class, and internal code will recognize JSON schema object or BaseModel, and will transform it to the required JSON that goes over the network.
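You can see the raw material for that transformation yourself: Pydantic can emit the JSON Schema for a class, and the SDK's parse() path wraps something like this in the extra metadata the API's response_format expects (a minimal sketch, assuming Pydantic v2):

```python
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

# Pydantic itself emits the JSON Schema for the class; the SDK
# adds the "json_schema"/"strict" wrapper before sending it.
schema = Person.model_json_schema()
print(schema["properties"])  # the per-field schema
```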

Thank you for the clarification. If I understand correctly, pydantic_function_tool will not always lead to the expected output JSON, even if, in the example, the output is likely to be structured as the Pydantic class requires.

import openai  # needed for the openai.pydantic_function_tool helper below
from openai import OpenAI
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": "Extract the names and ages of the people mentioned in the following text."
        },
        {
            "role": "user",
            "content": "John is 30 years old and his sister Alice is 25."
        }
    ],
    tools=[
        openai.pydantic_function_tool(Person)
    ]
)

print(completion.choices[0].message.tool_calls[0].function.parsed_arguments)
{
    'name': 'John',
    'age': 30
}

The LLM is free to behave outside the scope of this schema. Perhaps it will decide not to use the tool for some reason.
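Because of that, indexing `tool_calls[0]` blindly can raise an error. A defensive sketch (the `extract_tool_arguments` helper is hypothetical, shown here against stand-in message objects rather than a live completion):

```python
from types import SimpleNamespace

def extract_tool_arguments(message):
    """Return the first tool call's parsed arguments, or None when
    the model replied in natural language instead of calling a tool."""
    if getattr(message, "tool_calls", None):
        return message.tool_calls[0].function.parsed_arguments
    return None

# Stand-in for completion.choices[0].message when no tool was called:
no_call = SimpleNamespace(tool_calls=None, content="John is 30 years old.")
print(extract_tool_arguments(no_call))  # None: fall back to message.content
```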

The object returned by openai.pydantic_function_tool(Person)

{'type': 'function',
 'function': {'name': 'Person',
  'strict': True,
  'parameters': {'properties': {'name': {'title': 'Name', 'type': 'string'},
    'age': {'title': 'Age', 'type': 'integer'}},
   'required': ['name', 'age'],
   'title': 'Person',
   'type': 'object',
   'additionalProperties': False}}}

has type ‘function’, so maybe the LLM thinks it is returning the arguments of some function described by the class Person, which makes this a plain function-calling situation.

You do not want to use tools or a tool call to simply ask the AI to deliver final results.

That is what a response format parameter and specification is for: Receiving the output of a fixed job you want the AI to do, where it produces validated JSON as its product.

The parse method shown works on response_format.

Plus, your object construction there is unsuitable.

Skip Pydantic for something so simple.

from openai import OpenAI
client = OpenAI()

response = client.chat.completions.create(
  model="gpt-4.1",
  messages=[
    {
      "role": "system",
      "content": [
        {
          "type": "text",
          "text": "Extract the primary person's information"
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "I'm Joe, age 29, and I hate broccoli."
        }
      ]
    }
  ],
  response_format={
    "type": "json_schema",
    "json_schema": {
      "name": "ai_response",
      "strict": True,
      "schema": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "description": "The name of the individual."
          },
          "age": {
            "type": "integer",
            "description": "The age of the individual."
          }
        },
        "required": [
          "name",
          "age"
        ],
        "additionalProperties": False
      }
    }
  },
  temperature=0.1,
)

With the depicted strict schema for output, the AI cannot produce anything other than the JSON.

After you’ve figured out how to even make a call, you can then use the parse() method with a BaseModel class instead of the JSON schema.