Improve the error messages for JSON Schema Structured Output

Currently I’m trying to get the AI to return me OpenAPI specs.

To do so, I pass the OpenAPI JSON Schema spec as the response_format parameter.

The OpenAPI spec in its raw form violates a lot of the constraints that OpenAI has.

So I’m going through and modifying my schema: making fields required and nullable, adding "additionalProperties": false, and so on.
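
The fix-up pass I’m running looks roughly like the sketch below (the function name and the exact nullable handling are my own choices, not anything OpenAI prescribes): it walks the schema, adds "additionalProperties": false to every object, lists every property as required, and turns formerly optional properties into nullable ones.

# Hypothetical fix-up pass (my own helper, not OpenAI code): recursively
# adapt a JSON Schema so it fits the structured-output constraints.
def adapt_for_strict_mode(schema):
    if not isinstance(schema, dict):
        return schema

    schema = dict(schema)  # shallow copy so the original stays untouched

    if schema.get("type") == "object" and "properties" in schema:
        originally_required = set(schema.get("required", []))
        new_props = {}
        for name, prop in schema["properties"].items():
            prop = adapt_for_strict_mode(prop)
            if name not in originally_required:
                # Strict mode wants every key in "required", so formerly
                # optional fields become required-but-nullable instead.
                if isinstance(prop, dict) and isinstance(prop.get("type"), str):
                    prop["type"] = [prop["type"], "null"]
                else:
                    prop = {"anyOf": [prop, {"type": "null"}]}
            new_props[name] = prop
        schema["properties"] = new_props
        schema["required"] = list(new_props)
        schema["additionalProperties"] = False

    if isinstance(schema.get("items"), dict):
        schema["items"] = adapt_for_strict_mode(schema["items"])

    for key in ("anyOf", "allOf", "oneOf"):
        if key in schema:
            schema[key] = [adapt_for_strict_mode(s) for s in schema[key]]

    return schema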

Unhelpfully, though, the error message doesn’t provide information about what the violation is. We just see:

error: {
  message: "Invalid schema for response_format 'openapi_spec_modified': In context=(), object schema missing properties.",
  type: "invalid_request_error",
  param: "response_format",
  code: null,
},
code: null,
param: "response_format",
type: "invalid_request_error",

It would be nice to add more details to this error message.

The response_format parameter is for supplying a JSON schema as the only way the AI can form a response to you: the model is forced to fill the fields of the structured output with whatever data seems to be needed.

“Return me OpenAPI specs” is not what it is for.

If you’d like a JSON schema validator that will alert you to the position where parsing first fails, you can use the “chat” API playground and add a response_format of json_schema there. To get an idea of the required format beyond the schema itself, you can use “Generate” in the playground and describe to the nice AI what kind of output you need the chat model to respond with.

Here’s a response_format that just asks for the keys displayed:

{
  "name": "classification_output",
  "schema": {
    "type": "object",
    "properties": {
      "document_type": {
        "type": "string",
        "description": "The type of the document."
      },
      "document_length": {
        "type": "number",
        "description": "The length of the document in words."
      },
      "document_retitled": {
        "type": "string",
        "description": "The title of the document after classification."
      },
      "document_keywords": {
        "type": "array",
        "description": "Keywords associated with the document classification.",
        "items": {
          "type": "string"
        }
      }
    },
    "required": [
      "document_type",
      "document_length",
      "document_retitled",
      "document_keywords"
    ],
    "additionalProperties": false
  },
  "strict": true
}
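
For reference, here is roughly how a block like that gets passed in an API call. This is only a sketch using the Python SDK, with an abbreviated schema and a placeholder model name; adjust both to whatever you’re actually using:

from openai import OpenAI

client = OpenAI()

# Abbreviated version of the schema above; the full one is passed the same way.
classification_schema = {
    "type": "object",
    "properties": {
        "document_type": {"type": "string"},
        "document_keywords": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["document_type", "document_keywords"],
    "additionalProperties": False,
}

completion = client.chat.completions.create(
    model="gpt-4o-mini",  # any model that supports structured outputs
    messages=[
        {"role": "system", "content": "Classify the document."},
        {"role": "user", "content": "Quarterly sales figures and commentary..."},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "classification_output",
            "schema": classification_schema,
            "strict": True,
        },
    },
)

print(completion.choices[0].message.content)  # JSON conforming to the schema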

The response_format parameter is for supplying a JSON schema as the only way the AI can form a response to you: the model is forced to fill the fields of the structured output with whatever data seems to be needed.

That’s right. An OpenAPI spec can be described by a JSON schema.

Without it (just using response_format.type=“json_object”), I was finding that the AI sometimes returned OpenAPI-looking specs that were actually badly formed.

Unfortunately, I can’t just bung the actual OpenAPI JSON Schema in; I need to pare it down to fit OpenAI’s constraints. But the paring process is proving more difficult than it needs to be because of the somewhat opaque error messages.
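
For now I’m pre-checking the schema locally before sending it, so I at least get a path to each offending node. A rough sketch of that check (my own helper, and the rules it knows about are just the constraints I’ve hit so far, not an official list):

# Hypothetical pre-flight check (my own helper): walk the schema and report
# the path of each node that will trip a structured-output rule.
def find_strict_mode_violations(schema, path="(root)"):
    violations = []
    if not isinstance(schema, dict):
        return violations

    if schema.get("type") == "object":
        props = schema.get("properties", {})
        if not props:
            violations.append(f"{path}: object schema missing 'properties'")
        if schema.get("additionalProperties") is not False:
            violations.append(f"{path}: 'additionalProperties' must be false")
        if set(schema.get("required", [])) != set(props):
            violations.append(f"{path}: every property must be listed in 'required'")
        for name, prop in props.items():
            violations += find_strict_mode_violations(prop, f"{path}/properties/{name}")

    if isinstance(schema.get("items"), dict):
        violations += find_strict_mode_violations(schema["items"], f"{path}/items")

    for key in ("anyOf", "allOf", "oneOf"):
        for i, sub in enumerate(schema.get(key, [])):
            violations += find_strict_mode_violations(sub, f"{path}/{key}/{i}")

    return violations


if __name__ == "__main__":
    import json, sys
    with open(sys.argv[1]) as f:  # e.g. the pared-down OpenAPI schema file
        for problem in find_strict_mode_violations(json.load(f)):
            print(problem)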

Just be aware that what I show above demonstrates the need for a meta-object within which the schema the AI receives is placed:

{
  "name": "classification_output",
  "schema": {},
  "strict": true
}

Then, at the next nesting level, the actual “schema” object (and it must begin with an object, not an array, not an anyOf, …) is how the AI response is specified.

“strict”: true must build an enforceable grammar, so it is limited to a deterministic nesting depth, all keys required on all objects, limited enum counts, etc.

I think the playground outputs a description of the error when you pass your schema in its configuration, but I may be wrong.