I am trying to implement function calling with gpt-3.5-turbo-1106, but the issue I am encountering is that the model sometimes hallucinates a function name that is not even defined.
Ideally, the assistant should hold a normal conversation and only trigger the function when a user asks for their booking status. The issue arises when a user asks a question that is not covered by any of the available functions.
Example: User: “What’s the date today?”
In this case we don’t even have a function defined as get_todays_date, but the model still attempts to call one and thus runs into an error.
Please excuse me if I have phrased something naively; I am pretty new to this.
Try listing the available functions in the system prompt, along with how you want them invoked, and see if it still tries to call non-existent functions.
Add something like this in the system prompt:
...
# available tools
You have the following tools you can invoke depending on user request.
- get_booking_details: when the user asks for their booking status given the booking_id.
...
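One way to keep that prompt section and the actual tool definitions in sync is to generate the prompt from the same list you pass to the API. A minimal sketch (the tool name and descriptions here are just examples based on this thread, not a confirmed schema from the original poster's code):

```python
# Build the "# available tools" section of the system prompt from the
# same tool definitions passed to the API, so the two never drift apart.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_booking_details",
            "description": "Look up a booking's status given its booking_id.",
            "parameters": {
                "type": "object",
                "properties": {"booking_id": {"type": "string"}},
                "required": ["booking_id"],
            },
        },
    },
]

def build_system_prompt(tools):
    lines = [
        "You are a booking assistant.",
        "",
        "# available tools",
        "You may ONLY invoke the tools listed below. If the user's request",
        "does not match any of them, answer in plain text instead.",
    ]
    for t in tools:
        fn = t["function"]
        lines.append(f"- {fn['name']}: {fn['description']}")
    return "\n".join(lines)

print(build_system_prompt(tools))
```

The resulting string would then be sent as the system message alongside the `tools` list in the chat completion request.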
But I have a long list of functions to call, and adding their descriptions to the prompt will increase the prompt tokens significantly. Is there a better way to achieve this?
@supershaneski thanks for your help, will try your suggestion as well.
Just make your code a bit more robust to hallucinations; you can’t fix GPT itself. When it tries to call an unknown function, send an error message as the tool output and remind it of the list of available functions with their descriptions.
In Python I also usually validate parameters with pydantic and simply return the pydantic validation error as the function output.
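A minimal sketch of that pydantic approach, assuming pydantic v2 and a hypothetical `get_booking_details` function (the schema and field names are illustrative, not from the original poster's code):

```python
# Validate the model's raw JSON arguments with pydantic; on failure,
# return the validation error text as the tool output so the model
# can see what went wrong and self-correct.
import json
from pydantic import BaseModel, ValidationError

class BookingArgs(BaseModel):
    booking_id: str  # hypothetical required parameter

def run_get_booking_details(raw_arguments: str) -> str:
    try:
        args = BookingArgs.model_validate_json(raw_arguments)
    except ValidationError as e:
        # Feed the error straight back as the tool result.
        return f"Error: invalid arguments: {e}"
    # Stubbed lookup; a real implementation would query a database here.
    return json.dumps({"booking_id": args.booking_id, "status": "confirmed"})

print(run_get_booking_details('{"booking_id": "B123"}'))
print(run_get_booking_details('{"id": 42}'))  # wrong field name, returns error text
```

The point is that a malformed call never raises in your code; it just produces an error string the model receives on the next turn.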
Thanks for the suggestion, but I am still encountering issues with this. I am thinking of applying an intent classifier before I send the message to the OpenAI API. But I just want to make sure whether there is a better way, and how people deal with model hallucinations in crucial applications of function calling.
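The intent-gate idea can be sketched very simply: decide locally whether the message looks like a booking query, and only pass the `tools` list to the API when it does. The keywords and gating rule below are assumptions for illustration; a real classifier could use embeddings or a small model instead:

```python
# Hypothetical pre-filter: only expose tools to the API when the
# message plausibly needs them; otherwise call the API with no tools,
# so there is nothing to hallucinate.
BOOKING_KEYWORDS = ("booking", "reservation")

def should_enable_tools(user_message: str) -> bool:
    text = user_message.lower()
    return any(kw in text for kw in BOOKING_KEYWORDS)

print(should_enable_tools("What's my booking status?"))  # True
print(should_enable_tools("What's the date today?"))     # False
```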
I had not encountered this problem, but that’s likely because I use gpt-4-1106-preview.
So I changed the model in my code and asked it “What’s today’s date?”. gpt-4-1106-preview answered the question without function calls, but when I changed the model to gpt-3.5-turbo-1106 it went crazy. It didn’t hallucinate a function name, but it hallucinated arguments for my functions and called all three of them.
In these cases, just validate the arguments: if they are valid, do the call and return the result in the tool outputs; if not, return an error as the tool output. gpt-3.5-turbo will see that it is not able to get the time, stop trying, and tell you.
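The dispatch loop described here can be sketched as follows (the function table and stub result are hypothetical; the unknown-function branch also reminds the model of the real tools, as suggested earlier in the thread):

```python
# Robust dispatch: unknown function names and bad arguments both come
# back as error strings in the tool output instead of crashing the app.
import json

AVAILABLE_FUNCTIONS = {
    "get_booking_details": lambda booking_id: json.dumps(
        {"booking_id": booking_id, "status": "confirmed"}  # stubbed result
    ),
}

def dispatch(name: str, raw_arguments: str) -> str:
    if name not in AVAILABLE_FUNCTIONS:
        return (
            f"Error: unknown function '{name}'. "
            f"Available functions: {', '.join(AVAILABLE_FUNCTIONS)}."
        )
    try:
        arguments = json.loads(raw_arguments)
        return AVAILABLE_FUNCTIONS[name](**arguments)
    except (json.JSONDecodeError, TypeError) as e:
        return f"Error: invalid arguments for '{name}': {e}"

# A hallucinated call like get_todays_date just yields an error output:
print(dispatch("get_todays_date", "{}"))
```

Each result, valid or not, goes back to the model as the tool output for that call, so the model sees the failure and can recover on its next turn.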
That’s what I’m doing, and honestly those hallucinations do happen, but I don’t notice them because they don’t “break” the conversation.