Function parameter requirements not reliable

I have been experimenting with the GPT functions to see if it can be of any use somewhere in my system but I am finding that the parameter “required” is not followed very strictly. Simple example:

{
      "name": "open_a_file",
      "description": "open a file",
      "parameters": {
         "type": "object",
         "properties": {
            "path": {
               "type": "string",
               "description": "the path to the file."
            },
            "reason": {
               "type": "string",
               "description": "a reason for opening the file."
            }
         },
         "required": [
            "path",
            "reason"
         ]
      }
   }

When the next message is “open package.json”, it will just give me function arguments back even though I declared that a reason is required.

I am also getting a lot of false positives where it detects that I want to call a function but didn’t remotely suggest it.

I have plenty of ways to build in my own checks and makes sure it works as expected, that’s not the problem. I am simply wondering. Is this just a current limitation of the model?

1 Like

The reason for arbitrary use of properties is due to the very minor change between what the AI receives “required” and non-required properties. Let me demonstrate:

// Queries the given web page (nonsense function)
type QueryWebPage = (_: {
// site to use (property not set required)
url?: string,
// input to site (property set to required)
question: string,

The descriptions marked with // are those of function and properties. The function includes only"required": ["question"]. The difference? The non-required one gets a question mark.

(ps: putting your own question mark at the end of the key makes the endpoint behave weird and put question marks in elsewhere; don’t do that)

You can use system message prompting to describe that if a user has not stated a reason for opening a file, files should never be opened.

Just be glad the AI didn’t make up its own reason. It is prone to hallucinate and fill in properties it never obtained from the user.

Another thought. Does the function actual “open” a file in the way that software might, or is it better named retrieve_file_contents?

PS. They have changed the resistance to revealing system prompts, I discover when revisiting my dumping of internal AI language with the same script again. This tuning is likely what has had the side effect of making the instructions in system prompts get ignored in the last several days, causing poor following of the programmer’s instructions within. OpenAI needs to give up and revert the latest tuning badly, because they won’t be able to stop the AI from responding to me or others anyway.

@_j Thank you for your response. My example function here is also a simplified ‘dummy’ function, that yes indeed would be better called file_get_contents or something.
What bothers me is that my “functions” payload can/may be different after each messag in the dialog and the system message steers the behavior of the overall session. If I have to put every requirement of every function into the system message, I would have to dynamically insert/append them to the system message based on whatever is happening in the dialog at the present time. This would be horrible.

My use case is way more complicated, has potentially hundreds of functions (which will not all but selectively being passed to the API depending on the message history) and my frustration is that I get a false positive function respons when one of the required arguments is not specified.

Adding another layer to verify whether the OpenAI API did its job properly is not a great solution.

I hope this makes sense.

1 Like

Fortunately, you run your application’s fixed system prompt with functions and no user role, you get the API reply with answer for the size of that role message including functions and the seven token overhead of using chat models. Program that token count. Software can even do this as an automatic preliminary exercise per customized session. Additional role messages have four more tokens overhead. Problem solved.

Thank you for your response. Perhaps I should have explained my case better but I am operating on a very different level.

Have you found any solution? After many experiments with the most recent models, I have found the ‘function call’ technology not applicable for practical use.