Strict = true schema bug, max parameters

I am using a very long function for my API Assistant, and have run into what seems to be a bug:

When strict = false, the function saves and runs well.

However when strict = true, I get the following response when saving:

“Invalid schema for function ‘X’: 413 parameters exceeds limit of 100.”

The strict code is the only line I am changing, the following is the same for every parameter:
"additionalProperties": false,
"required": [x,x,x,x,x,]

I believe this to be a bug as there seems to be no reason why setting strict = true should force the limit to be set at 100. I may be mistaken though.


Hi @wolsen !

I believe that when you set the strict schema, there is some more rigorous validation, including on the depth of your schema, and number of nesting levels. They actually specify this restriction of 100 object properties here.
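To see how far over the limit a schema is before saving it, a rough property counter can help. This is only a sketch: `count_properties` is a hypothetical helper, and it assumes the limit counts every key under every "properties" map, including nested ones. OpenAI's exact counting rules may differ.

```python
def count_properties(schema: dict) -> int:
    """Count every key under every "properties" map, including nested schemas."""
    total = 0
    # Each property counts once, plus whatever it contains.
    for sub in schema.get("properties", {}).values():
        total += 1 + count_properties(sub)
    # Array item schemas can contain objects with their own properties.
    items = schema.get("items")
    if isinstance(items, dict):
        total += count_properties(items)
    # Alternatives and shared definitions can hold properties too.
    for sub in schema.get("anyOf", []):
        total += count_properties(sub)
    for sub in schema.get("$defs", {}).values():
        total += count_properties(sub)
    return total


example = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
        "address": {
            "type": "object",
            "properties": {"city": {"type": "string"}, "zip": {"type": "string"}},
        },
    },
}
print(count_properties(example))  # 5
```

Running this on the function's "parameters" object should give a number comparable to the 413 in the error message, assuming the counting assumption above holds.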


Damn okay that is super unfortunate, do you have any suggestions? I put in my instructions to follow it strictly but it isn’t always listening.

Also, is it not strange that strict = false allows for >100 parameters?

Maybe 4 functions, called in parallel?

My suggestion would be to try and refactor the schema (e.g. can you use enums for one object instead of defining multiple objects). Another approach is to sub-divide it into smaller sub-schemas, and do a call for each one - more expensive and slower of course.
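A minimal sketch of the sub-dividing approach, assuming the parameters are a flat map of properties (nested objects would count for more than one each). The names `split_schema`, `field_N`, and `extract_part_N` are made up for illustration, and the tool wrapper shape is my assumption of the usual function-tool layout, so check it against your own working definition.

```python
def split_schema(properties: dict, max_props: int = 100) -> list[dict]:
    """Split a flat properties map into strict-compatible object schemas."""
    names = list(properties)
    chunks = []
    for start in range(0, len(names), max_props):
        keys = names[start:start + max_props]
        chunks.append({
            "type": "object",
            "properties": {k: properties[k] for k in keys},
            # Same pattern as the original schema: every key required,
            # no additional properties allowed.
            "required": keys,
            "additionalProperties": False,
        })
    return chunks


# Placeholder stand-in for the real 413 parameters.
flat = {f"field_{i}": {"type": "string"} for i in range(413)}
parts = split_schema(flat)

# Hypothetical tool definitions, one per chunk.
tools = [
    {
        "type": "function",
        "function": {"name": f"extract_part_{n}", "strict": True, "parameters": part},
    }
    for n, part in enumerate(parts, start=1)
]
print([len(p["properties"]) for p in parts])  # [100, 100, 100, 100, 13]
```

Grouping the keys by meaning (for example one function per group of Excel tabs) would probably work better than slicing by position, since each function then has a coherent purpose.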

So strict = false means that a lot of validation checks are not done, one of them being the depth/nesting limit. Other checks are also not done. Setting strict = true basically means that you are conforming exactly to OpenAI’s interpretation of the JSON schema spec.

I’ll look into dividing the 413 into 5 separate function calls and running them in parallel.

File Search only allows for text response formats, so that could be another route, but I imagine it would hit the same schema limit issue.

The goal of the assistant is to pull specific info from a long PDF and put it into a 40+ tab Excel file (strict template). Maybe there’s another workaround for that: sending through the Excel template along with the PDF and having the assistant fill out the template?

or…

Setting "strict": true implies that a schema artifact, scoped at the organization level, is constructed and cached based on the parsing of the response_format parameter on specific models. This schema artifact enforces a structured pattern on the logit generation process, which is then applied to the token sampling algorithm for future use. This prevents the output of sequences that do not comply with the schema. This feature has implications beyond just the language provided to the AI, including practical infrastructure considerations related to size and performance. Therefore, you will encounter particular new input schema level rejections during preprocessing, based on length, nesting, or other restrictions that have been placed.
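As a toy illustration of what "enforcing a pattern on the logits" means (this is not OpenAI's implementation, and the tokens and scores are invented): at each step, any token the schema does not permit is masked out before one is picked.

```python
import math


def constrained_pick(logits: dict[str, float], allowed: set[str]) -> str:
    """Pick the highest-scoring token among those the schema permits."""
    if not allowed & logits.keys():
        raise ValueError("no permitted token available")
    # Disallowed tokens get -inf, so they can never be selected.
    masked = {
        tok: (score if tok in allowed else -math.inf)
        for tok, score in logits.items()
    }
    return max(masked, key=masked.get)


# Invented scores: the model "prefers" free text, but right after `{`
# a JSON object only permits the opening quote of a key.
logits = {'"': 1.2, "}": 2.5, "hello": 3.1}
print(constrained_pick(logits, {'"'}))  # "
```

Building the table of what is allowed at each step from the schema is the part that has to be precomputed and cached, which would be consistent with size and nesting limits applying only when strict is on.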

I would go with the former - I have never seen it conform well to a spreadsheet or doc template. So defining 4-5 schemas that provide specific information, sending those along with the data/PDF should give you what you need.