500 Enums Limitation in Structured Output

Hi everybody!

I recently ran into an interesting and, as far as I know, undocumented limitation of the Structured Outputs feature: a limit of 500 enum values in total per schema.

The error message is the following:

openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid schema for response_format 'MyTemplate': Expected at most 500 enum values in total within a single schema when using structured outputs, but received 520. Consider reducing the number of enums, or use 'strict: false' to opt out of structured outputs.", 'type': 'invalid_request_error', 'param': 'response_format', 'code': None}}
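For reference, here is a minimal sketch of the kind of request payload that triggers this error. The schema name `MyTemplate` matches the error above, but the property name and the 520 placeholder values are illustrative, not my actual schema:

```python
# Illustrative schema with 520 enum values, exceeding the 500-value cap.
schema = {
    "type": "object",
    "properties": {
        "category": {
            "type": "string",
            # 520 values in total -> more than the 500 allowed
            "enum": [f"value_{i}" for i in range(520)],
        },
    },
    "required": ["category"],
    "additionalProperties": False,
}

# Passing this response_format to chat.completions.create with
# strict=True raises the 400 BadRequestError shown above.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "MyTemplate",
        "strict": True,  # set to False to opt out of structured outputs
        "schema": schema,
    },
}

print(len(schema["properties"]["category"]["enum"]))  # → 520
```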

I wonder what the constraining factor behind this limitation is (e.g., context window size). Why was the number 500 chosen? Are there plans to increase this limit?

Finally, I think it is worth documenting this limitation in the Structured Outputs docs, as currently it is only discoverable by exceeding it.
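Until it is documented, one way to catch this before the API does is to total the enum values yourself. Here is a small helper I use for that (my own utility, not part of any SDK), which recursively walks a schema dict and counts every value under an `enum` keyword:

```python
def count_enum_values(schema) -> int:
    """Recursively total all enum values in a JSON schema structure."""
    if isinstance(schema, dict):
        # Count this level's enum values, then recurse into subschemas
        # (properties, items, anyOf, etc.); plain strings contribute 0.
        total = len(schema.get("enum", []))
        for value in schema.values():
            total += count_enum_values(value)
        return total
    if isinstance(schema, list):
        return sum(count_enum_values(item) for item in schema)
    return 0


# Usage: check against the 500-value cap before sending the request.
example = {
    "type": "object",
    "properties": {
        "color": {"type": "string", "enum": ["red", "green", "blue"]},
        "size": {"type": "string", "enum": ["small", "large"]},
    },
}
print(count_enum_values(example))  # → 5
```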

Hope this helps others and the limit gets increased!


Did you try this? What problem can it cause?

  • Under the hood

First, we trained our newest model gpt-4o-2024-08-06 to understand complicated schemas and how best to produce outputs that match them. However, model behavior is inherently non-deterministic—despite this model’s performance improvements (93% on our benchmark), it still did not meet the reliability that developers need to build robust applications.

If you do not use strict, you rely only on the AI's intelligence: its ability to read the 500+ items from the massive schema placed in its input as guidance for how to write, with nothing preventing it from simply writing its own values.

The next step is…

In order to force valid outputs, we constrain our models to only tokens that would be valid according to the supplied schema, rather than all available tokens.

and that includes the token possibilities at each enum position in a schema.

As you can imagine, if you write a schema with a bunch of anyOf subschemas and a bunch of long strings, each having a long list of possibilities, there has to be some kind of cutoff so that it doesn't take 30 seconds of backend churn before the output even starts. A maximum enum count is more understandable than a "maximum proprietary backend object artifact construct".

The quotes are from here, for further reading: https://openai.com/index/introducing-structured-outputs-in-the-api/