Several posts have mentioned that certain tools (e.g. Pydantic) generate a valid JSON schema that is similar to, but distinct from, the example schema shown in the docs.
As far as I can tell, aside from that one example in the docs (shown below) and a few in the cookbook, no official guidance has been provided on how strictly this schema must be followed.
OpenAI docs schema example:
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]

messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)
Contrast this with a schema generated by Pydantic:
from typing import Literal

from pydantic import BaseModel


class GetCurrentWeather(BaseModel):
    """Get the current weather in a given location"""

    location: str
    # Optional field: Pydantic renders this as anyOf[enum, null] with a default
    unit: Literal["celsius", "fahrenheit"] | None = None


g = GetCurrentWeather(location="san francisco", unit="fahrenheit")
g.model_json_schema()
>>>
{'description': 'Get the current weather in a given location',
 'properties': {'location': {'title': 'Location', 'type': 'string'},
                'unit': {'anyOf': [{'enum': ['celsius', 'fahrenheit'],
                                    'type': 'string'},
                                   {'type': 'null'}],
                         'default': None,
                         'title': 'Unit'}},
 'required': ['location'],
 'title': 'GetCurrentWeather',
 'type': 'object'}
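The results have some high-level similarities, but overall the structure, nesting, and included fields are quite different: the Pydantic output has no "type": "function" wrapper, uses "title" where the docs example uses "name", adds a "title" to every property, and expresses the optional unit as an "anyOf" with a null type and a default.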
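To make the comparison concrete, here is a minimal sketch of how the Pydantic output could be rearranged into the envelope used in the docs example. `to_openai_tool` is a hypothetical helper of my own, not part of the OpenAI SDK or Pydantic, and using the model's `title` as the function name is an assumption on my part. It only moves the top-level fields; the nested differences (per-property `title`, `anyOf` with `null`, `default`) are left untouched, since whether they need to be removed is exactly what I don't know.

```python
from copy import deepcopy

# Pydantic-style schema, as in the model_json_schema() output above.
pydantic_schema = {
    "description": "Get the current weather in a given location",
    "properties": {
        "location": {"title": "Location", "type": "string"},
        "unit": {
            "anyOf": [
                {"enum": ["celsius", "fahrenheit"], "type": "string"},
                {"type": "null"},
            ],
            "default": None,
            "title": "Unit",
        },
    },
    "required": ["location"],
    "title": "GetCurrentWeather",
    "type": "object",
}


def to_openai_tool(schema: dict) -> dict:
    """Hypothetical helper: wrap a Pydantic-style schema in the docs' envelope."""
    parameters = deepcopy(schema)  # leave the caller's dict unmodified
    name = parameters.pop("title")  # assumption: model title doubles as function name
    description = parameters.pop("description", "")
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,
        },
    }


tool = to_openai_tool(pydantic_schema)
print(sorted(tool["function"]["parameters"]))
# ['properties', 'required', 'type']
```

After the wrap, the top level of `parameters` has the same three keys as the docs example, so any remaining mismatch is confined to the individual property definitions.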
Empirically this works fine, but it’s essentially “undocumented” behavior, since the docs show a different schema. My question, then: is a specific schema really required, or is it enough to provide valid JSON with fields for the function name, its description, its arguments, and so on?
If there is a specific schema required, where is it documented in full detail, if at all, and is the fact that other schemas work essentially a “lucky accident”?