Gpt-4-turbo is not respecting "required", while gpt-4-turbo-preview is

Hello everyone, I’m attempting to upgrade some of our services from gpt-4/gpt-4-turbo-preview to gpt-4-turbo today and have run into an issue.

I can’t post the full prompt here; however, it is a function call that returns a semi-complex series of nested objects, with “required” specified for each of the properties.

This exact function call has a 100% success rate on gpt-4 and gpt-4-turbo-preview; however, on gpt-4-turbo (currently pointing to gpt-4-turbo-2024-04-09) I get:

{
  "id": "...",
  "object": "chat.completion",
  "created": 1712770721,
  "model": "gpt-4-turbo-2024-04-09",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "function_call": { } // function is missing >90% of its body, containing only the first a (sometimes a,b) "required" for each section whenever there is a "required" [a, b, c, d, ...]
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 471,
    "completion_tokens": 85,    // very small
    "total_tokens": 556
  },
  "system_fingerprint": "..."
}

Thanks for reporting! Unfortunately function-calling is not guaranteed to follow the schema, and there can be differences in quality from model to model. My recommendation is to try prompt engineering the model e.g. “Always include the foo parameter”.
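
For illustration, here is a minimal sketch of that kind of nudging with the Python SDK; the function name and parameters are hypothetical placeholders, not your actual schema:

# Minimal sketch of the suggested prompt nudging (OpenAI Python SDK >= 1.0).
# "report_items" and its parameters are hypothetical placeholders.
from openai import OpenAI

client = OpenAI()

report_items_schema = {
    "name": "report_items",
    "description": "Report every item mentioned in the text.",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {"type": "string", "description": "Short title for the text"},
            "items": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Every item found in the text",
            },
        },
        "required": ["title", "items"],
    },
}

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        # Restate the schema's "required" in plain language as extra insurance.
        {"role": "system", "content": "Always include both the title and items parameters."},
        {"role": "user", "content": "Report the items in: apples, pears, plums."},
    ],
    functions=[report_items_schema],
    function_call={"name": "report_items"},
)
print(response.choices[0].message.function_call)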


Understood, I’ll try that.

I worry this is a major regression, however: we have made millions of queries over the last year on many different gpt-4 & gpt-4-turbo-preview versions with no issues.

This is the first time I’ve ever seen this kind of “ignore required” behavior.

Update: So far I’ve been unable to get any kind of prompt engineering to convince gpt-4-turbo to return all of the requested parameters as before. Oh well, I guess we’ll stick with -preview for now.


Also notable is the number of tokens generated while you received none of them. The AI might have gone nutty writing into its multi_tool_use function wrapper, yet without emitting a function name that the API was willing to pass back to you for diagnosis.

The lack of any alternative clear text or logprobs in this return, plus a logit_bias that is unable to affect tool invocation, only shows that OpenAI distrusts developers to receive or send what actually powers the AI.

This is something I’ll investigate more myself, to see whether they are now blocking unspecified functions, much like the way your tool return inputs are blocked unless they carry the tool ID the AI actually wrote and are accompanied by a forced assistant replay.


Also, if you don’t need or want to handle parallel tool calls, you can “retool” to using the functions parameter rather than the tools parameter specification.
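
Roughly, the difference between the two request shapes (the schema below is a throwaway placeholder, not anyone’s real function):

# Sketch of the two request shapes with the OpenAI Python SDK; the schema is a placeholder.
from openai import OpenAI

client = OpenAI()

schema = {
    "name": "my_function",
    "description": "Example function.",
    "parameters": {
        "type": "object",
        "properties": {"child": {"type": "string", "description": "A single element"}},
        "required": ["child"],
    },
}

# Newer "tools" style: the schema is wrapped, and the model may emit parallel tool calls.
tools_style = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "..."}],
    tools=[{"type": "function", "function": schema}],
    tool_choice={"type": "function", "function": {"name": "my_function"}},
)

# Legacy "functions" style: no parallel-call wrapper, a single function_call in the response.
functions_style = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "..."}],
    functions=[schema],
    function_call={"name": "my_function"},
)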

Sorry, what do you mean exactly? I believe I am using functions.

Our call contains:

  • model
  • messages (system and user)
  • functionCall: [name: my_function]
  • functions: { {my_function_schema} }

The function schema is rather large, and every child element/property is “required”.

gpt-4-turbo absolutely calls the function in this case, giving me:

"function_call": {
          "name": "my_function",
          "arguments": <filtered>       // the response is only a small fraction of the total schema
        }

So I believe I am using “functions”? Or is there something I’m missing?

If you are using functions, then you are using functions! It is not absolutely clear just from the (lack of) response what the AI received as input.

Very important to note: only the first (lowest) level of a function’s parameter nesting receives either the “required” treatment for a property or a description for the property. If you have an object inside the object, not only will a nested property never be marked as optional (which would otherwise show as the name simply followed by a question mark), its description will also not be passed. The AI has to rely on the name alone to infer the purpose unless you give an extensive main function description. Knowing this might also make one assume you didn’t step into this minefield.
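
To make that concrete, here is a hypothetical schema annotated with where the behavior I’m describing reportedly bites; this is an illustration of the claim, not something verified against the backend:

# Hypothetical schema illustrating the nesting behavior described above.
my_function_schema = {
    "name": "my_function",
    "description": "Extract the record. The child field is a single element about X.",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {
                # Top level: description and required/optional status reach the model.
                "type": "string",
                "description": "Title of the record",
            },
            "my_object": {
                "type": "object",
                "properties": {
                    "child": {
                        # Nested level: this description reportedly is not passed,
                        # so the model sees only the name "child".
                        "type": "string",
                        "description": "A single element that is about X",
                    },
                },
                # Nested "required" reportedly gets the same treatment.
                "required": ["child"],
            },
        },
        "required": ["title", "my_object"],
    },
}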

Unless the backend behavior of functions was modified in this model.

I just adapted a “code interpreter” script for the model and functions, and it is still writing Python that was never specified by API parameter.

Gotcha! Yeah I didn’t feel comfortable posting the full prompt/input here.

I’ve definitely noticed the “description inside child objects does not change the output” behavior before, back when we were first testing function calls in the initial public release. I figured it was a bug, and we compensated with more-descriptive names that seemed to do the trick.

That said, historically we’ve had perfect “schema conformance” prior to this just-released model, gpt-4-turbo-2024-04-09.

I’ve sent our enterprise rep a 100% reproducible sample and they’ve opened an internal ticket. Hoping to hear back soon!


Yes, I have experienced firsthand that the newly released gpt-4-turbo is a big regression when it comes to function calling. gpt-4-turbo-preview is an absolute savant by comparison.

_j, are you saying that if you use functions instead of tools, the parallel tool calling will not be included in the prompt on OpenAI’s side? In other words, does using functions instead of tools guarantee no calls to parallel_tool_calls?

I’ve been using a system prompt to prevent parallel_tool_calls, but of course it is not 100% successful, as I am contradicting the OpenAI system prompt.
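
Roughly the kind of line I mean (the wording is only an example, and it obviously cannot guarantee the model skips the parallel wrapper):

# Example system-prompt wording to discourage parallel tool calls; no guarantees.
messages = [
    {
        "role": "system",
        "content": "Call at most one function per response. "
                   "Never wrap calls in multi_tool_use.parallel.",
    },
    {"role": "user", "content": "..."},
]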

I am experiencing a slightly different issue where the gpt-4-turbo model completely rejects the required attribute and returns the following error:

OpenAIException - Error code: 400 - {'error': {'message': "Unknown parameter: 'tools[1].function.required'.", 'type': 'invalid_request_error', 'param': 'tools[1].function.required', 'code': 'unknown_parameter'}}

The same function descriptions work with gpt-4-1106-preview as well as when I remove the required attribute from the functions.

Found the issue. It looks like required now needs to move into the parameters section; perhaps it was simply ignored before, but now, as a result of stricter type checking, it fails.

Can you post a before/after where you moved “required”?

I have something like the following:

              "my_object": {
                "type": 'array',
                "description": 'an array of child objects, sorted by X',
                "items": {
                  "type": 'object',
                  "properties": {
                    "child": {
                      "type": 'string',
                      "description": 'A single element that is about X'
                    }
                  },
                  "required": ['child']
                }
              },
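
For a generic before/after, inferred from the error path tools[1].function.required rather than anyone’s real schema:

# Inferred from the error path tools[1].function.required; the function is hypothetical.

# Rejected by gpt-4-turbo-2024-04-09: "required" as a sibling of "parameters".
bad_tool = {
    "type": "function",
    "function": {
        "name": "my_function",
        "parameters": {
            "type": "object",
            "properties": {"child": {"type": "string"}},
        },
        "required": ["child"],  # wrong level -> "Unknown parameter: 'tools[...].function.required'"
    },
}

# Accepted: "required" inside "parameters" (the JSON Schema placement).
good_tool = {
    "type": "function",
    "function": {
        "name": "my_function",
        "parameters": {
            "type": "object",
            "properties": {"child": {"type": "string"}},
            "required": ["child"],
        },
    },
}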

Yes, functions has no bloat wrapper injected for the AI to write its tool calls into, and the API has no mechanism of IDs for matching up multiple tool calls with the response each particular call returned.

It is also better followed, and just better.

I am getting the same error with function calling:

{'error': {'message': "Unknown parameter: 'tools[1].function.required'.", 'type': 'invalid_request_error', 'param': 'tools[1].function.required', 'code': 'unknown_parameter'}}

The required object IS in the parameters (from what I remember, this was always the necessary structure).

For now I just completely removed the required object and added the required params into the function description itself (in other words, it’s added into the prompt).
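
Something along these lines (a hypothetical stand-in, not my real schema):

# Hypothetical sketch of the workaround: drop "required" and state it in the description.
my_function_schema = {
    "name": "my_function",
    "description": "Extract the record. The parameters title and child are both "
                   "required and must always be included in the arguments.",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {"type": "string", "description": "Title of the record"},
            "child": {"type": "string", "description": "A single element about X"},
        },
        # no "required" list anywhere; the description above carries that instruction
    },
}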

Ok, I played around with using functions instead of tools. With just a drop-in replacement, I found that the AI did not call the functions as readily as when using tools: with tools the AI would often call the tool at the correct time, whereas with functions it was sometimes called and sometimes not.

I am not using much prompt engineering w.r.t functions/tools in the system prompt. I am just using some basic descriptions in the function schemas themselves.

Is it worth it to try to get functions working to the level of tools via additional prompting? The parallel tool calls do mess up my program sometimes, so I’d prefer to be rid of them.