Quirk in OpenAI’s Function Call Schema: Single-Property Schemas Fail to Execute

I’ve recently encountered unusual behavior in OpenAI’s function calling implementation when using structured outputs. Specifically, if a function schema has only a single required property, the API consistently fails with a 400 status code and an error like this:

{
  "error": {
    "message": "Invalid schema for function 'generate_simple_list': 'items' is not of type 'array'.",
    "type": "invalid_request_error",
    "param": "functions[0].parameters",
    "code": "invalid_function_parameters"
  }
}

However, if I add a second property to the schema—without changing anything else—the function works perfectly, and the API responds as expected.

What I Was Doing

I was trying to define a function schema for a simple structured output. Here’s an example of a single-property schema that fails:

{
  "name": "generate_simple_list",
  "parameters": {
    "type": "object",
    "properties": {
      "items": {
        "type": "array",
        "items": { "type": "string" }
      }
    },
    "required": ["items"]
  }
}

And here’s the two-property schema that works:

{
  "name": "generate_simple_list",
  "parameters": {
    "type": "object",
    "properties": {
      "items": {
        "type": "array",
        "items": { "type": "string" }
      },
      "count": {
        "type": "integer"
      }
    },
    "required": ["items", "count"]
  }
}

The only difference is the addition of the count property.

Reproducible Behavior

Here’s a simple example using R, but the same behavior should be reproducible in other programming languages:

  1. Single-Property Schema:
  • The API fails with a 400 status code, returning an error indicating that the schema is invalid.
  2. Two-Property Schema:
  • The API processes the request successfully, calling the specified function and returning the expected output.

You can see the reproducible example in my code (included below).

library(httr)
library(jsonlite)

# Function to call the OpenAI API
hey_ChatGPT <- function(input, function_definitions = NULL) {
  body <- list(
    model = input$model,
    messages = input$messages
  )
  
  if (!is.null(function_definitions)) {
    body$functions <- function_definitions
  }
  
  body_json <- toJSON(body, auto_unbox = TRUE)
  response <- POST(
    url = "https://api.openai.com/v1/chat/completions",
    add_headers(Authorization = paste("Bearer", Sys.getenv("CR_API_KEY"))),
    content_type_json(),
    body = body_json
  )
  
  if (response$status_code == 200) {
    response_content <- content(response, as = "parsed", type = "application/json")
    return(response_content$choices[[1]]$message$function_call)
  } else {
    stop("API request failed with status code: ", response$status_code, "\n", content(response, "text"))
  }
}

# Single-property schema that fails
function_definitions_single <- list(
  list(
    name = "generate_simple_list",
    parameters = list(
      type = "object",
      properties = list(
        items = list(
          type = "array",
          items = list(type = "string")
        )
      ),
      required = c("items")
    )
  )
)

# Multi-property schema that works
function_definitions_multi <- list(
  list(
    name = "generate_simple_list",
    parameters = list(
      type = "object",
      properties = list(
        items = list(
          type = "array",
          items = list(type = "string")
        ),
        count = list(
          type = "integer"
        )
      ),
      required = c("items", "count")
    )
  )
)

# Input messages
input <- list(
  model = "gpt-4",
  messages = list(
    list(role = "system", content = "You are a helpful assistant."),
    list(role = "user", content = "Generate a list of three fruits.")
  )
)

# Test the single-property schema (expect 400 error)
tryCatch({
  result <- hey_ChatGPT(input, function_definitions_single)
  print(result)
}, error = function(e) {
  cat("Single-property schema failed:\n", e$message, "\n")
})

# Test the multi-property schema (expect success)
tryCatch({
  result <- hey_ChatGPT(input, function_definitions_multi)
  print(result)
}, error = function(e) {
  cat("Multi-property schema failed:\n", e$message, "\n")
})

Questions for the Community

  1. Is this a known or intentional behavior in OpenAI’s API implementation?
  2. If it’s intentional, are there guidelines for when a schema requires multiple properties to function correctly?
  3. Is this something OpenAI plans to document or address in future updates?

Any clarification or guidance from the community or OpenAI team would be greatly appreciated!

Hello @paul.connor,

Thank you for sharing your code.

It is possible that this issue is caused by a conflict with the reserved JSON Schema keyword “items”, which is used to define the elements of an array.

To verify this, you can rename the single property to something else, such as “list_items”, and pass the schema again.

Hi @sps

Thank you for reading and for your suggestion, but I don’t think this is the issue. I tried renaming the property and still ran into the same error.

# Modified single-property schema with renamed property
function_definitions_single_renamed <- list(
  list(
    name = "generate_simple_list",
    parameters = list(
      type = "object",
      properties = list(
        list_items = list(  # Renamed from "items" to "list_items"
          type = "array",
          items = list(type = "string")
        )
      ),
      required = c("list_items")  # Updated required field
    )
  )
)

# Test the renamed single-property schema
tryCatch({
  result <- hey_ChatGPT(input, function_definitions_single_renamed)
  print(result)
}, error = function(e) {
  cat("Renamed single-property schema failed:\n", e$message, "\n")
})

I decided to test your original schema.

It resulted in the error:

openai.BadRequestError: Error code: 400 - {'error': {'message': "Missing required parameter: 'tools[0].type'.", 'type': 'invalid_request_error', 'param': 'tools[0].type', 'code': 'missing_required_parameter'}}

which reminded me that, when a function is passed via the tools parameter, its JSON schema must be wrapped like this:

{
  "type": "function",
  "function": {...}   # Function definition goes inside the braces
}

For structured outputs, we also need to add the strict: true parameter to your function definition. Additionally, all parameters must be included in the required array, and additionalProperties: false must be set.

Thus, the correct schema would be:

{
  "type": "function",
  "function": {
    "name": "generate_simple_list",
    "strict": true,
    "parameters": {
      "type": "object",
      "properties": {
        "items": { "type": "array", "items": { "type": "string" } }
      },
      "required": ["items"],
      "additionalProperties": false
    }
  }
}

You’ll also have to switch to gpt-4o to use structured outputs because:

Structured Outputs are available in our latest large language models, starting with GPT-4o:

  • o1-2024-12-17 and later
  • gpt-4o-mini-2024-07-18 and later
  • gpt-4o-2024-08-06 and later

Older models like gpt-4-turbo and earlier may use JSON mode instead.
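
For the original R helper, here is a minimal, untested sketch of the same request in the tools format, assuming the httr/jsonlite approach and the CR_API_KEY variable from the first post. Note that wrapping a single-element vector in I() keeps it serialized as a JSON array when auto_unbox = TRUE.

library(httr)
library(jsonlite)

# Sketch only: the corrected tool definition wrapped in type = "function"
tool_spec <- list(
  list(
    type = "function",
    `function` = list(
      name = "generate_simple_list",
      strict = TRUE,
      parameters = list(
        type = "object",
        properties = list(
          items = list(
            type = "array",
            items = list(type = "string")
          )
        ),
        # I() stops auto_unbox from collapsing a single-element vector,
        # so "required" is sent as ["items"] rather than "items"
        required = I(c("items")),
        additionalProperties = FALSE
      )
    )
  )
)

body <- list(
  model = "gpt-4o",
  messages = list(
    list(role = "user", content = "Generate a list of three fruits.")
  ),
  tools = tool_spec
)

response <- POST(
  url = "https://api.openai.com/v1/chat/completions",
  add_headers(Authorization = paste("Bearer", Sys.getenv("CR_API_KEY"))),
  content_type_json(),
  body = toJSON(body, auto_unbox = TRUE)
)

# With the tools API, the call comes back under message$tool_calls
# instead of message$function_call
parsed <- content(response, as = "parsed", type = "application/json")
parsed$choices[[1]]$message$tool_calls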

Here's the code I used to test:

from openai import OpenAI


client = OpenAI()

tool_spec = [
    {
        "type": "function",
        "function": {
            "name": "generate_simple_list",
            "strict": True,
            "parameters": {
                "type": "object",
                "properties": {"items": {"type": "array", "items": {"type": "string"}}},
                "required": ["items"],
                "additionalProperties": False,
            },
        },
    }
]


conve = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Give me 3 fruits in a list.",
            }
        ],
    }
]
response = client.chat.completions.create(
    model="gpt-4o", messages=conve, tools=tool_spec
)

print(response)

Here’s its output:

ChatCompletion(id='chatcmpl-Aixx8b7OqAnBUV1kcnNSHxxxxxxxx', choices=[Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_7yKPLgy1yuJvU5NNxxxxxxx', function=Function(arguments='{"items":["Apple","Banana","Cherry"]}', name='generate_simple_list'), type='function')]))], created=1735066602, model='gpt-4o-2024-08-06', object='chat.completion', service_tier=None, system_fingerprint='fp_xxxxxxxxxx', usage=CompletionUsage(completion_tokens=22, prompt_tokens=48, total_tokens=70, completion_tokens_details=CompletionTokensDetails(audio_tokens=0, reasoning_tokens=0, accepted_prediction_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))

Excellent. It might not be obvious how to write a JSON schema function specification that lets an AI language model call an API accepting an array: a list of unknown length whose items have their own type definition.

Here’s a canonical example that also describes the function’s purpose to the AI, so the function will be used automatically and appropriately.

{
  "name": "get_weather_data",
  "description": "Retrieves the current weather for a list of city names. Accepts a list of city names as input and returns a JSON object containing the weather information for each city.",
  "strict": true,
  "parameters": {
    "type": "object",
    "required": [
      "cities"
    ],
    "properties": {
      "cities": {
        "type": "array",
        "description": "Array of city names to retrieve weather data for",
        "items": {
          "type": "string",
          "description": "Name of a city"
        }
      }
    },
    "additionalProperties": false
  }
}

The AI then calls the get_weather_data function with:

{
  "cities": [
    "New York",
    "Los Angeles",
    "Chicago",
    "Houston",
    "Phoenix"
  ]
}