When using structured output via the API, the JSON validator rejects schemas where a $ref keyword has extra sibling keywords, like description:
{
  "type": "object",
  "properties": {
    "foo": {
      "$ref": "#/$defs/Foo",
      "description": "A Foo object."
    }
  },
  "required": ["foo"],
  "additionalProperties": false,
  "$defs": {
    "Foo": {
      "type": "object",
      "properties": {
        "id": { "type": "string" }
      },
      "required": ["id"],
      "additionalProperties": false
    }
  }
}
Will result in:
openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid schema for response_format 'json_schema': context=('properties', 'foo'), $ref cannot have keywords {'description'}.", 'type': 'invalid_request_error', 'param': 'text.format.schema', 'code': 'invalid_json_schema'}}
This severely limits the expressivity of the schema. If a schema contains an object Foo and an object Bar, and Foo has a property of type Bar, there is currently no way to give the model guidance about how a Foo relates to its Bar. You can provide a description for the Foo object and for the Bar object themselves, but not for the relation between them. You could fold the Foo-Bar relation into the descriptions of Foo and/or Bar, but this breaks down as soon as you introduce more objects that may also own a Bar, since the single Bar description would then have to cover every owner.
This can be trivially worked around by wrapping the $ref in a single-element anyOf and keeping description as a sibling of the anyOf, but this isn't an ideal long-term solution:
{
  "type": "object",
  "properties": {
    "foo": {
      "anyOf": [
        { "$ref": "#/$defs/Foo" }
      ],
      "description": "A Foo object."
    }
  },
  "required": ["foo"],
  "additionalProperties": false,
  "$defs": {
    "Foo": {
      "type": "object",
      "properties": {
        "id": { "type": "string" }
      },
      "required": ["id"],
      "additionalProperties": false
    }
  }
}
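If the schema is generated rather than written by hand, the same workaround can be applied automatically before the request is sent. Below is a minimal Python sketch, not an official fix: the helper name wrap_ref_siblings is hypothetical, and it assumes the schema is a plain dict/list structure and that no property is literally named "$ref".

```python
import json
from typing import Any


def wrap_ref_siblings(node: Any) -> Any:
    """Return a copy of `node` in which every object that has "$ref" plus
    sibling keywords is rewritten to {"anyOf": [{"$ref": ...}], <siblings>}.

    Hypothetical helper. Objects that already have an "anyOf" sibling are
    left alone so that existing alternatives are not overwritten.
    """
    if isinstance(node, list):
        return [wrap_ref_siblings(item) for item in node]
    if not isinstance(node, dict):
        return node  # strings, numbers, booleans, None

    # Rebuild the dict so the caller's schema is never mutated.
    converted = {key: wrap_ref_siblings(value) for key, value in node.items()}

    has_siblings = "$ref" in converted and len(converted) > 1
    if has_siblings and "anyOf" not in converted:
        ref = converted.pop("$ref")
        converted = {"anyOf": [{"$ref": ref}], **converted}
    return converted


# The rejected schema from the top of this post.
schema = {
    "type": "object",
    "properties": {
        "foo": {"$ref": "#/$defs/Foo", "description": "A Foo object."}
    },
    "required": ["foo"],
    "additionalProperties": False,
    "$defs": {
        "Foo": {
            "type": "object",
            "properties": {"id": {"type": "string"}},
            "required": ["id"],
            "additionalProperties": False,
        }
    },
}

fixed = wrap_ref_siblings(schema)
print(json.dumps(fixed["properties"]["foo"], indent=2))
```

A bare $ref with no siblings is left untouched, so the rewrite only changes the spots the validator would reject.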