Measuring Maximum Depth and Object Properties in Structured Outputs

The docs list two limitations on structured outputs:

A schema may have up to 100 object properties total, with up to 5 levels of nesting.

However, it is not clear how these criteria are measured. Take this official example:

{
    "name": "get_weather",
    "description": "Fetches the weather in the given location",
    "strict": true,
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The location to get the weather for"
            },
            "unit": {
                "type": ["string", "null"],
                "description": "The unit to return the temperature in",
                "enum": ["F", "C"]
            }
        },
        "additionalProperties": false,
        "required": [
            "location", "unit"
        ]
    }
}

What are the depth and the total number of object properties here? Are name, description, and strict all counted as properties? Does the top-level properties key already use up 1 level of depth?

Looking at a second official example:

{
  "name": "query",
  "description": "Execute a query.",
  "strict": true,
  "parameters": {
    "type": "object",
    "properties": {
      "table_name": {
        "type": "string",
        "enum": ["orders"]
      },
      "columns": {
        "type": "array",
        "items": {
          "type": "string",
          "enum": [
            "id",
            "status",
            "expected_delivery_date",
            "delivered_at",
            "shipped_at",
            "ordered_at",
            "canceled_at"
          ]
        }
      },
      "conditions": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "column": {
              "type": "string"
            },
            "operator": {
              "type": "string",
              "enum": ["=", ">", "<", ">=", "<=", "!="]
            },
            "value": {
              "anyOf": [
                {
                  "type": "string"
                },
                {
                  "type": "number"
                },
                {
                  "type": "object",
                  "properties": {
                    "column_name": {
                      "type": "string"
                    }
                  },
                  "required": ["column_name"],
                  "additionalProperties": false
                }
              ]
            }
          },
          "required": ["column", "operator", "value"],
          "additionalProperties": false
        }
      },
      "order_by": {
        "type": "string",
        "enum": ["asc", "desc"]
      }
    },
    "required": ["table_name", "columns", "conditions", "order_by"],
    "additionalProperties": false
  }
}

A valid path into this object is:

.parameters.properties.conditions.items.properties.value.anyOf[2].properties.column_name

This path is deeper than 5 levels, yet the example is an official one, so I must be measuring this criterion incorrectly.
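To make the question concrete, here is a minimal sketch of one possible counting rule. It is purely my assumption and is not confirmed by the docs: each schema that has a "properties" key adds one level of nesting, each key inside a "properties" object counts as one object property, and wrapper keys such as name, description, strict, items, and anyOf add nothing by themselves. The function name measure is hypothetical, and $ref/$defs are not handled.

```python
from typing import Any, Tuple


def measure(schema: Any, depth: int = 0) -> Tuple[int, int]:
    """Return (max_nesting_depth, total_object_properties) for a JSON schema.

    ASSUMED rule (not confirmed by the docs): a schema with a "properties"
    key adds one nesting level, and every key inside "properties" counts as
    one object property. "items" and "anyOf" are traversed but add no depth.
    """
    if not isinstance(schema, dict):
        return depth, 0

    max_depth, total = depth, 0
    children = []

    props = schema.get("properties")
    if isinstance(props, dict):
        depth += 1                      # entering an object adds one level
        max_depth = depth
        total += len(props)             # each named property counts once
        children.extend(props.values())

    items = schema.get("items")
    if isinstance(items, dict):
        children.append(items)          # array element schema, same level

    any_of = schema.get("anyOf")
    if isinstance(any_of, list):
        children.extend(any_of)         # union branches, same level

    for child in children:
        child_depth, child_total = measure(child, depth)
        max_depth = max(max_depth, child_depth)
        total += child_total
    return max_depth, total


# The "parameters" schema of the first example above.
get_weather_parameters = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "unit": {"type": ["string", "null"], "enum": ["F", "C"]},
    },
    "additionalProperties": False,
    "required": ["location", "unit"],
}

print(measure(get_weather_parameters))  # (1, 2)
```

Under this assumed rule, the "parameters" schema of the second example comes out at a depth of 3 with 8 object properties, which would be well inside both limits. Whether the API actually counts this way is exactly what I am asking.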

The API (even in strict mode) does not seem to respond with helpful error messages when you reach these limits, just a general error that the provided JSON schema is invalid. Any guidance here?

Hi @peter.edmonds and welcome to the forums!

Have you gone much further with this?

I tested a “dummy” Pydantic HTML schema with 6-7 object levels and didn’t run into any issues. What did your schema look like when you received those errors, and what was the exact error message?

In terms of guidance: I try to avoid deeply nested schemas, i.e. I always look for a way to restructure my schema so that it is as flat as possible. Similarly with properties: I try to avoid having too many, and use arrays and enums instead wherever possible.
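As a rough illustration of the arrays-and-enums idea, here is a hypothetical sketch that has not been tested against the API. It reuses the column names from the query example above; the names COLUMNS, wide_schema, and compact_schema are made up for this post. The first schema spends one boolean property per column, while the second expresses the same choice with a single array property whose items are restricted by an enum.

```python
# Column names taken from the "query" example earlier in the thread.
COLUMNS = [
    "id",
    "status",
    "expected_delivery_date",
    "delivered_at",
    "shipped_at",
    "ordered_at",
    "canceled_at",
]

# Property-heavy version: one boolean flag per column (7 object properties).
wide_schema = {
    "type": "object",
    "properties": {name: {"type": "boolean"} for name in COLUMNS},
    "required": list(COLUMNS),
    "additionalProperties": False,
}

# Compact version: a single array property whose items come from an enum.
compact_schema = {
    "type": "object",
    "properties": {
        "columns": {
            "type": "array",
            "items": {"type": "string", "enum": list(COLUMNS)},
        }
    },
    "required": ["columns"],
    "additionalProperties": False,
}

print(len(wide_schema["properties"]), len(compact_schema["properties"]))  # 7 1
```

The compact version keeps the property count at 1 no matter how many columns are added, since new columns only extend the enum.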