Function Calling: A 'python' function is frequently called even though it does not exist in the 'functions' parameter

Consider the following request:

{
	"temperature":0.7,
	"model":"gpt-3.5-turbo-0613",
	"messages":[
		{
			"role":"user",
			"content":"What's the 1337th prime?"
		}
	],
	"functions":[
		{
			"name":"WolframAlpha",
			"description":"Use natural language queries with Wolfram|Alpha to get up-to-date computational results about entities in chemistry, physics, geography, history, art, astronomy, and more.",
			"parameters":{
				"type":"object",
				"properties":{
					"query":{
						"type":"string",
						"description":"the query text used as input"
					}
				},
				"required":["query"]
			}
		}
	],
	"function_call":"auto"
}

This almost always ends up in a function call to ‘python’, which I have no way of handling in my client-side code (plus it’s just slow and inefficient, even if I could run it):

{
  "id": "chatcmpl-7RNRoeOvNxdWjd47LvZfZB0jF4XO3",
  "object": "chat.completion",
  "created": 1686759344,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "function_call": {
          "name": "python",
          "arguments": "def is_prime(n):\n    if n <= 1:\n        return False\n    for i in range(2, int(n**0.5) + 1):\n        if n % i == 0:\n            return False\n    return True\n\ndef nth_prime(n):\n    count = 0\n    num = 2\n    while count < n:\n        if is_prime(num):\n            count += 1\n        num += 1\n    return num - 1\n\nnth_prime(1337)"
        }
      },
      "finish_reason": "function_call"
    }
  ],
  "usage": {
    "prompt_tokens": 91,
    "completion_tokens": 111,
    "total_tokens": 202
  }
}

It seems like any type of computational problem almost always results in a python call, regardless of what functions you’ve specified. For example, this also results in ‘python’ function calls:

{
	"temperature":0.7,
	"model":"gpt-3.5-turbo-0613",
	"messages":[
		{
			"role":"user",
			"content":"What's the 1337th prime?"
		}
	],
	"functions":[
		{
			"name":"javascript",
			"description":"Evaluate javascript code.",
			"parameters":{
				"type":"object",
				"properties":{
					"code":{
						"type":"string",
						"description":"the code to evaluate"
					}
				},
				"required":["code"]
			}
		}
	],
	"function_call":"auto"
}

I ran that 100 times, and it only tried using the ‘javascript’ function once. Every other time it attempted to use ‘python’.

This seems like a serious fundamental flaw.

Is this a bug?

What are my options here?

7 Likes

What happens if you “spoil” the term? Just curious

"functions":[{
 "name":"python",
 "description":"Not implemented"
}]
2 Likes

This is what I suspected would happen… they didn’t add any mechanism to prevent it from hallucinating non-existent functions or parameters. Give AlphaWave a try… GitHub - Stevenic/alphawave: AlphaWave is a very opinionated client for interfacing with Large Language Models. or GitHub - Stevenic/alphawave-py: AlphaWave is a very opinionated client for interfacing with Large Language Models. You can get the same function generation support pretty much out of the box, but more reliably. I guarantee it will always call your JavaScript function. In fact, it’s impossible for it to call a hallucinated function, because AlphaWave actually validates the returned JSON, which (as I suspected) OpenAI apparently does not.

I’m out of town this week but planning to fold official function support into AlphaWave this weekend. You can already achieve the same functionality, though, and have it work more reliably than their new feature. @bruce.dambrosio

And I’m happy to explain why AlphaWave will probably always do a better job at this task than OpenAI. You actually need to make multiple model calls (at least one extra) to get the model to fix hallucinations, and OpenAI has zero financial incentive to give you two model calls for the cost of one. Not happening.

4 Likes

And, of course, a side benefit is that AlphaWave is portable across LLMs.

1 Like

I experienced the exact same problem with another prompt. I named a function “run”, but gpt-3.5-turbo-0613 called a non-existent function named “python”.

2 Likes

I encountered the same issue: an undefined “python” function was called for a computational prompt. How can I solve it?

1 Like

Seems like the solution here is just to let the chatbot have some sandboxed python to do what it wants – and you might get better answers if the AI thinks writing a little function can answer the question. OpenAI has likely done quite a bit of tuning on their own code-interpreter ChatGPT so it answers mathematical questions for itself, and discouraging that baked-in behavior might take some system prompting, like “this user is disqualified from using direct python interpreter functions”.
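
If you go that route, here is a rough sketch of handling the hallucinated call by running the returned code in a subprocess with a timeout. This is an illustration only: the function name is mine, and a subprocess is not a real sandbox.

import json
import subprocess
import sys

def run_python_call(function_call):
    """Crude handler for the hallucinated 'python' call.

    NOTE: a subprocess with a timeout is NOT a real sandbox; only run
    code you are actually willing to execute.
    """
    raw = function_call["arguments"]
    # For 'python' calls the model often emits raw code instead of a
    # JSON object (see the response above), so fall back accordingly.
    try:
        parsed = json.loads(raw)
        code = parsed.get("code", "") if isinstance(parsed, dict) else raw
    except json.JSONDecodeError:
        code = raw
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=10,
    )
    return result.stdout or result.stderr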

That’s not the issue… these models hallucinate. It has nothing to do with the model wanting to use Python. You have to first detect that they’re hallucinating and then confront them with the hallucination, and they’ll come back with a better response.

Ideally this gets done (see “How do we submit evals for function calls?”), then we can start submitting evals so the next iteration is honed.
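
For what it’s worth, the detect-and-confront loop is simple enough to roll yourself with the openai SDK. A sketch, with illustrative helper names (this is not AlphaWave’s actual API):

import openai

KNOWN_FUNCTIONS = {"WolframAlpha"}  # whatever you declared in `functions`

def get_validated_call(messages, functions, max_retries=2):
    """Re-prompt until the model stops calling functions that don't exist."""
    for _ in range(max_retries + 1):
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo-0613",
            messages=messages,
            functions=functions,
            function_call="auto",
        )
        message = response["choices"][0]["message"]
        call = message.get("function_call")
        if call is None or call["name"] in KNOWN_FUNCTIONS:
            return message
        # Confront the model with its hallucination and try again.
        messages.append(message)
        messages.append({
            "role": "function",
            "name": call["name"],
            "content": "Error: no function named '%s' exists. Available: %s"
                       % (call["name"], sorted(KNOWN_FUNCTIONS)),
        })
    return message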

One trick that did help for me was injecting some extra guidance into the system message; that seemed to work around some of the hallucinations.
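
Something along these lines (the wording here is an example to tweak, not what I used verbatim):

messages = [
    {
        "role": "system",
        # Illustrative guidance; adjust the wording for your use case.
        "content": (
            "You are a helpful assistant. Only call the functions provided "
            "in this request. There is no 'python' function available; "
            "never attempt to call one."
        ),
    },
    {"role": "user", "content": "What's the 1337th prime?"},
]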

Try improving the description in the function schema. I was able to reproduce your case, and it was fixed after I updated the description.

Before: the model hallucinated a call to “python”. After: it called “WolframAlpha”.

The updated function schema:

{
  "name": "WolframAlpha",
  "description": "This is a function for calculating math problems, pass the query to it and it returns the result",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "the query text used as input, interpret user's question"
      }
    },
    "required": [
      "query"
    ]
  }
}

I think it’s important to tweak the function schema, especially the descriptions, so the model understands how the functions are meant to be used.

4 Likes

That’s not fixed. That’s just modifying the description so it will handle the example I gave. I need it to use Wolfram Alpha for many other kinds of problems, not just math calculations.

1 Like

If you don’t want to specifically write out “you don’t have a python function available, never attempt to call python” as a system prompt, you could put something like “overrides python for all mathematics” in the function description and tweak until effective.
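
For example, prepending that phrase to the original schema’s description (the wording is just a starting point to tweak):

wolfram_function = {
    "name": "WolframAlpha",
    # The "Overrides python..." sentence is the tweak suggested above.
    "description": (
        "Overrides python for all mathematics. Use natural language queries "
        "with Wolfram|Alpha to get up-to-date computational results about "
        "entities in chemistry, physics, geography, history, art, astronomy, "
        "and more."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "the query text used as input",
            }
        },
        "required": ["query"],
    },
}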

Yes, that’s just an example showing how the function’s description affects the result. If you want to use the function for other tasks, write a better and clearer description, or maybe use multiple functions with different descriptions.

Thanks for the idea :smile:, that worked for me:

{
  name: 'python',
  description: 'Not implemented.',
  parameters: {
    type: 'object',
    properties: {}
  }
}

It now calls the right function, but it puts the arguments into content instead:

{
  "role": "assistant",
  "content": "{\n  \"entities\": [...]\n}",
  "function_call": {
    "name": "renderEntities",
    "arguments": "{}"
  }
}

Anyway, I can deal with that.
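
To cope with it client-side, a small fallback that reads the arguments out of content whenever function_call.arguments comes back empty (a sketch):

import json

def extract_call(message):
    """Return (name, args), falling back to `content` when the model
    puts the JSON arguments there instead of in `arguments`."""
    call = message.get("function_call") or {}
    try:
        args = json.loads(call.get("arguments") or "{}")
    except json.JSONDecodeError:
        args = {}
    if not args and message.get("content"):
        try:
            args = json.loads(message["content"])
        except json.JSONDecodeError:
            pass  # content wasn't JSON; leave args empty
    return call.get("name"), args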

Edit:
Now the AI called python again :confounded:

Tried modifying the prompt as suggested: “You are intelligent chatbot. You are only allowed to use specified functions with relevant order by looking at the user query and context. You don’t have a python function available, never attempt to call python.”

And also added the python function as mentioned below:

{
    "name": "python",
    "description": "Not implemented.",
    "parameters": {
        "type": "object",
        "properties": {}
    }
}

It’s NOT working out; python is still getting called on subsequent requests. Any workaround for this?

How can we change the function_call from auto to a specific function name in this?

const res = await openai.createChatCompletion({
  model: model,
  messages: messages,
  temperature: 0,
  max_tokens: 300,
  top_p: 0.1,
  functions: [
    {
      name: 'generateImage',
      description: 'To give image generation prompt',
      parameters: {
        type: 'object',
        properties: {
          imagePrompt: {
            type: 'string',
            description: 'Image prompt',
          },
        },
        required: ['imagePrompt'],
      },
    },
  ],
  function_call: 'auto',
});
1 Like

From

function_call: 'auto',

To

function_call: { name: 'generateImage' },

See the API reference page for details.

2 Likes

I think I have the solution here: tell it python was called and failed.

I was able to replicate its desire to call a python function for almost any exact math, preferring it over the elaborate indirect Wolfram description of the first post, and to fix it with one more role message.

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",
        max_tokens=150,
        temperature=0.3,
        messages=[
            {
            "role": "system",
            "content": "You are a helpful AI assistant."
            },
            {
            "role": "user",
            "content": "What's the 1337th prime?"
            },
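            # Fake function result: pretend "python" was already called and
            # failed, so the model falls back to a declared function.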
            {
            "role": "function",
            "name": "python",
            "content": "no python available, call other function",
            }
        ],
        functions=[
            {
                "name": "WolframAlpha",
                "description": "Use natural language queries with Wolfram Alpha to get up-to-date computational results about entities in chemistry, physics, geography, history, art, astronomy, and more.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "the query text used as input"
                            }
                        },
                    "required": ["query"]
                    },
            }
        ],
        function_call="auto",
    )

response:

  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "function_call": {
          "name": "WolframAlpha",
          "arguments": "{\n  \"query\": \"1337th prime number\"\n}"
        }
      },
      "finish_reason": "function_call"
    }
  ]

It does not work for me. My solution is to override the Python function:

OCTOPUS_FUNCTIONS = [ 
    {   
        "name": "execute_python_code",
        "description": "Safely execute arbitrary Python code and return the result, stdout, and stderr.",
        "parameters": {
            "type": "object",
            "properties": {
                "explanation": {
                    "type": "string",
                    "description": "the explanation about the python code",
                },
                "code": {
                    "type": "string",
                    "description": "the python code to be executed",
                },
                "saved_filenames": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "A list of filenames that were created by the code",
                },
            },
            "required": ["explanation", "code"],                                                                                                                                                   
        },
    },  
    {   
        "name": "python",
        "description": "Safely execute arbitrary Python code and return the result, stdout, and stderr.",
        "parameters": {
            "type": "object",
            "properties": {
                "explanation": {
                    "type": "string",
                    "description": "the explanation about the python code",
                },
                "code": {
                    "type": "string",
                    "description": "the python code to be executed",
                },
                "saved_filenames": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "A list of filenames that were created by the code",
                },
            },
            "required": ["explanation", "code"],
        }
    },  
]

I just want GPT to call my function execute_python_code, and this works for me.
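
Since both entries share the same schema, the client side can route either name to one handler. A sketch, where execute_python_code stands in for your actual implementation:

import json

def dispatch(function_call):
    name = function_call["name"]
    args = json.loads(function_call["arguments"])
    # Route the decoy "python" entry to the real implementation too.
    if name in ("python", "execute_python_code"):
        return execute_python_code(**args)  # your own handler, not defined here
    raise ValueError("Unknown function: %s" % name)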

Depending on the model used and the number and complexity of the goals you are trying to achieve at once, the model may end up completely ignoring parts of the prompt. GPT-4 works well overall, but 3.5 and 3.5-turbo miss more things as the prompt gets more complicated.

On 3.5-turbo I tell the AI to only use the provided functions named ‘x’, ‘y’…
and that the ‘python’ function cannot be used.
I tried different words, ordering, etc., to get it to listen, but 3.5-turbo-0613 really likes that ‘python’ function.
I added a ‘python’ function with a description of ‘not implemented’ and it still calls it, with parameters my ‘python’ function doesn’t even have.

I never actually got it to the point where it would never call this made-up ‘python’ function. So instead, if the ‘python’ function name comes up, I inject a user message saying that the ‘python’ function does not exist and that only ‘x’, ‘y’, ‘z’ can be used.

The AI still calls ‘python’, but after the user message comes through, it calls the correct functions.
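
Roughly, the retry looks like this (a sketch; ‘x’, ‘y’, ‘z’ are placeholders for the real function names):

import openai

ALLOWED = {"x", "y", "z"}  # placeholders for your real function names

def chat_with_guard(messages, functions, max_attempts=3):
    for _ in range(max_attempts):
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo-0613",
            messages=messages,
            functions=functions,
            function_call="auto",
        )
        message = response["choices"][0]["message"]
        call = message.get("function_call")
        if not call or call["name"] in ALLOWED:
            return message
        # Push back with a user message naming the allowed functions.
        messages.append({
            "role": "user",
            "content": "The '%s' function does not exist. You can only use: "
                       "%s." % (call["name"], ", ".join(sorted(ALLOWED))),
        })
    return message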