Gpt-4-turbo is not respecting "required", while gpt-4-turbo-preview is

Hello everyone, I’m attempting to upgrade some of our services from gpt-4/gpt-4-turbo-preview to gpt-4-turbo today and have run into an issue.

I can’t post the full prompt here; however, it is a function call that returns a semi-complex series of nested objects, with “required” specified for each of the properties.

This exact function call has a 100% success rate on gpt-4 and gpt-4-turbo-preview; however, on gpt-4-turbo (currently pointing to gpt-4-turbo-2024-04-09) I get:

{
  "id": "...",
  "object": "chat.completion",
  "created": 1712770721,
  "model": "gpt-4-turbo-2024-04-09",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "function_call": { } // function is missing >90% of its body, containing only the first a (sometimes a,b) "required" for each section whenever there is a "required" [a, b, c, d, ...]
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 471,
    "completion_tokens": 85,    // very small
    "total_tokens": 556
  },
  "system_fingerprint": "..."
}

Thanks for reporting! Unfortunately function-calling is not guaranteed to follow the schema, and there can be differences in quality from model to model. My recommendation is to try prompt engineering the model e.g. “Always include the foo parameter”.
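
For illustration, here is a minimal sketch of that kind of nudging with the Python SDK; the function name and parameters are hypothetical placeholders, not your actual schema:

# Minimal sketch of the suggested prompt nudging (OpenAI Python SDK >= 1.0).
# "report_items" and its parameters are hypothetical placeholders.
from openai import OpenAI

client = OpenAI()

report_items_schema = {
    "name": "report_items",
    "description": "Report every item mentioned in the text.",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {"type": "string", "description": "Short title for the text"},
            "items": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Every item found in the text",
            },
        },
        "required": ["title", "items"],
    },
}

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        # Restate the schema's "required" in plain language as extra insurance.
        {"role": "system", "content": "Always include both the title and items parameters."},
        {"role": "user", "content": "Report the items in: apples, pears, plums."},
    ],
    functions=[report_items_schema],
    function_call={"name": "report_items"},
)
print(response.choices[0].message.function_call)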


Understood, I’ll try that.

I worry this is a major regression, however: we have made millions of queries over the last year on many different gpt-4 & gpt-4-turbo-preview versions with no issues.

This is the first time I’ve ever seen this kind of “ignore required” behavior.

Update: So far I’ve been unable to get any kind of prompt engineering to convince gpt-4-turbo to return all of the requested parameters as before. Oh well, I guess we’ll stick with -preview for now.


Also notable is the number of tokens generated while you received none of them. The AI might have gone nutty writing into its multi_tool_use function wrapper, yet without emitting a function name that the API was willing to pass back to you for diagnosis.

The lack of any alternative clear text or logprobs in this return, plus a logit_bias that is unable to affect tool invocation, only shows that OpenAI distrusts developers to receive or send what actually powers the AI.

This is something I’ll investigate more myself, to see whether they are now blocking unspecified functions, much like the way your tool return inputs are blocked unless they carry the tool ID the AI actually wrote and are accompanied by a forced assistant replay.


Also, if you don’t need or want to handle parallel tool calls, you can “retool” to using the functions parameter rather than the tools parameter specification.
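
Roughly, the difference between the two request shapes (the schema below is a throwaway placeholder, not anyone’s real function):

# Sketch of the two request shapes with the OpenAI Python SDK; the schema is a placeholder.
from openai import OpenAI

client = OpenAI()

schema = {
    "name": "my_function",
    "description": "Example function.",
    "parameters": {
        "type": "object",
        "properties": {"child": {"type": "string", "description": "A single element"}},
        "required": ["child"],
    },
}

# Newer "tools" style: the schema is wrapped, and the model may emit parallel tool calls.
tools_style = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "..."}],
    tools=[{"type": "function", "function": schema}],
    tool_choice={"type": "function", "function": {"name": "my_function"}},
)

# Legacy "functions" style: no parallel-call wrapper, a single function_call in the response.
functions_style = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "..."}],
    functions=[schema],
    function_call={"name": "my_function"},
)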

Sorry, what do you mean exactly? I believe I am using functions.

Our call contains:

  • model
  • messages (system and user)
  • functionCall: [name: my_function]
  • functions: { {my_function_schema} }

The function schema is rather large, and every child element/property is “required”.

gpt-4-turbo absolutely calls the function in this case, giving me:

"function_call": {
          "name": "my_function",
          "arguments": <filtered>       // the response is only a small fraction of the total schema
        }

So I believe I am using “functions”? Or is there something I’m missing?

If you are using functions, then you are using functions! It is not absolutely clear just from the (lack of) response what the AI received as input.

Very important to note: only the first (lowest) level of a function’s parameter nesting receives either the “required” treatment for a property or a description for the property. If you have an object inside the object, not only will a nested property never be marked as optional (which would otherwise show as the name simply followed by a question mark), its description will also not be passed. The AI has to rely on the name alone to infer the purpose unless you give an extensive main function description. Knowing this might also make one assume you didn’t step into this minefield.
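
To make that concrete, here is a hypothetical schema annotated with where the behavior I’m describing reportedly bites; this is an illustration of the claim, not something verified against the backend:

# Hypothetical schema illustrating the nesting behavior described above.
my_function_schema = {
    "name": "my_function",
    "description": "Extract the record. The child field is a single element about X.",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {
                # Top level: description and required/optional status reach the model.
                "type": "string",
                "description": "Title of the record",
            },
            "my_object": {
                "type": "object",
                "properties": {
                    "child": {
                        # Nested level: this description reportedly is not passed,
                        # so the model sees only the name "child".
                        "type": "string",
                        "description": "A single element that is about X",
                    },
                },
                # Nested "required" reportedly gets the same treatment.
                "required": ["child"],
            },
        },
        "required": ["title", "my_object"],
    },
}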

Unless the backend behavior of functions was modified in this model.

I just adapted a “code interpreter” script for the model and functions, and it is still writing Python that was never specified by API parameter.

Gotcha! Yeah I didn’t feel comfortable posting the full prompt/input here.

I’ve definitely noticed the “description inside child objects does not change the output” behavior before, back when we were first testing function calls in the initial public release. I figured it was a bug, and we compensated with more-descriptive names that seemed to do the trick.

That said, historically we’ve had perfect “schema conformance” prior to this just-released model, gpt-4-turbo-2024-04-09.

I’ve sent our enterprise rep a 100% reproducible sample and they’ve opened an internal ticket. Hoping to hear back soon!


Yes, I have experienced firsthand that the newly released gpt-4-turbo is a big regression when it comes to function calling. gpt-4-turbo-preview is an absolute savant by comparison.

_j, are you saying that if you use functions instead of tools, the parallel tool calling will not be included in the prompt on OpenAI’s side? In other words, does using functions instead of tools guarantee no calls to parallel_tool_calls?

I’ve been using a system prompt to prevent parallel_tool_calls, but of course it is not 100% successful, as I am contradicting the OpenAI system prompt.
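
Roughly the kind of line I mean (the wording is only an example, and it obviously cannot guarantee the model skips the parallel wrapper):

# Example system-prompt wording to discourage parallel tool calls; no guarantees.
messages = [
    {
        "role": "system",
        "content": "Call at most one function per response. "
                   "Never wrap calls in multi_tool_use.parallel.",
    },
    {"role": "user", "content": "..."},
]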

I am experiencing a slightly different issue where the gpt-4-turbo model completely rejects the required attribute and returns the following error:

OpenAIException - Error code: 400 - {'error': {'message': "Unknown parameter: 'tools[1].function.required'.", 'type': 'invalid_request_error', 'param': 'tools[1].function.required', 'code': 'unknown_parameter'}}

The same function descriptions work with gpt-4-1106-preview as well as when I remove the required attribute from the functions.

Found the issue. It looks like required now needs to move into the parameters section; perhaps it was simply ignored before, but now, as a result of stricter type checking, it fails.

Can you post a before/after where you moved “required”?

I have something like the following:

              "my_object": {
                "type": 'array',
                "description": 'an array of child objects, sorted by X',
                "items": {
                  "type": 'object',
                  "properties": {
                    "child": {
                      "type": 'string',
                      "description": 'A single element that is about X'
                    }
                  },
                  "required": ['child']
                }
              },
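
For a generic before/after, inferred from the error path tools[1].function.required rather than anyone’s real schema:

# Inferred from the error path tools[1].function.required; the function is hypothetical.

# Rejected by gpt-4-turbo-2024-04-09: "required" as a sibling of "parameters".
bad_tool = {
    "type": "function",
    "function": {
        "name": "my_function",
        "parameters": {
            "type": "object",
            "properties": {"child": {"type": "string"}},
        },
        "required": ["child"],  # wrong level -> "Unknown parameter: 'tools[...].function.required'"
    },
}

# Accepted: "required" inside "parameters" (the JSON Schema placement).
good_tool = {
    "type": "function",
    "function": {
        "name": "my_function",
        "parameters": {
            "type": "object",
            "properties": {"child": {"type": "string"}},
            "required": ["child"],
        },
    },
}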

Yes, functions has no bloat wrapper injected for the AI to write its tool calls into, and the API has no mechanism of IDs for matching up multiple tool calls with the response each particular call returned.

It is also better followed, and just better.

I am getting the same error with function calling:

{'error': {'message': "Unknown parameter: 'tools[1].function.required'.", 'type': 'invalid_request_error', 'param': 'tools[1].function.required', 'code': 'unknown_parameter'}}

The required object IS in the parameters (from what I remember, this was always the necessary structure).

For now I just completely removed the required object and added the required params into the function description itself (in other words, it’s added into the prompt).
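
Something along these lines (a hypothetical stand-in, not my real schema):

# Hypothetical sketch of the workaround: drop "required" and state it in the description.
my_function_schema = {
    "name": "my_function",
    "description": "Extract the record. The parameters title and child are both "
                   "required and must always be included in the arguments.",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {"type": "string", "description": "Title of the record"},
            "child": {"type": "string", "description": "A single element about X"},
        },
        # no "required" list anywhere; the description above carries that instruction
    },
}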

Ok, I played around with using functions instead of tools. With just a drop-in replacement, I found that the AI did not call the functions as readily as when using tools: with tools the AI would often call the tool at the correct time, whereas with functions it was sometimes called and sometimes not.

I am not using much prompt engineering w.r.t functions/tools in the system prompt. I am just using some basic descriptions in the function schemas themselves.

Is it worth it to try to get functions working to the level of tools via additional prompting? The parallel tool calls do mess up my program sometimes, so I’d prefer to be rid of them.