Function calling seems to plan and reason pretty well compared to using the ReAct pattern. However, with ReAct you can always try to address failed reasoning by tuning the prompt, whereas with function calling it seems like you get what you get. Is this correct? In other words, if the model fails to call functions when and as desired, is there anything you can do in your prompting to overcome this? (I understand that the name and description you provide for the functions are things you can tune, but I am wondering if it’s worth trying to tune the actual prompt messages.)
EDIT: a better formulation of the question is whether it’s possible for the API to return both the content property (i.e. with reasoning steps) and the function_call property simultaneously. As shown below by _i, this is in fact possible.
Totally. The prompt, the function names, and the descriptions all work together to let the model know your wishes. It’s easy to think the functions are some “other thing,” but they are still text that the model is reading and trying to come up with a good reply to. The model has been fine-tuned to return function headers when it thinks it’s appropriate, but it’s still totally up to your prompts to let it know what it should do and when.
Hmm, I actually didn’t ask the question quite right. The issue I see is that with ReAct the prompt coaxes the model to think in a particular way by incorporating the thought process in the completion. From what I can tell, however, this is not possible with function calling because the completion seems either to be content or a function call, but not both. Is that not the case?
I have gotten answers that are a combination of both, so it is supported.
Output (with a python function capability): a response with both a user-readable answer and an invocation of a function looks like:

content: AI explanation of how code could be written to solve the problem, then the AI says “let’s see that in action”
function: python code to calculate and return the answer

content: explanation of how the AI would ask for a drawing of your picture
function: image generation API
You can give a prompt that is specifically designed to evoke this response, where the first output with thought processes can improve the quality of the function that is written:
first, explain the steps that are required to obtain the answer, and the tools that you can use to enhance your answer quality; then
to the best of your AI abilities, try to answer the question without the assistance of functions; then
after your explanation, actually do the task or calculation that was described.
Whether this type of output is included is as unpredictable as the function-calling itself, and whether these instructions belong in the system or user role may depend on your own experimentation, the quality of the answers, and the user inputs you are taking, but the step-by-step instructions should give a much higher chance of user-readable output.
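As a concrete sketch of the request side, here is roughly what such a call could look like. The function schema, system prompt wording, and model name are illustrative assumptions on my part, not an official recipe:

```python
# A sketch of a request designed to elicit both reasoning text and a
# function call in the same response. Schema and wording are examples only.
system_prompt = (
    "You will first output a full explanation of how you will attempt "
    "to answer the problem, step by step, and then, in the same "
    "response, use the tools available to calculate the answer."
)

functions = [
    {
        "name": "python",
        "description": "Run Python code in a sandbox and return the result.",
        "parameters": {
            "type": "object",
            "properties": {
                "code": {
                    "type": "string",
                    "description": "Python source to execute.",
                }
            },
            "required": ["code"],
        },
    }
]

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Please sum the first 50 digits of PI."},
]

# With the legacy (pre-tools) openai Python library, this payload would
# be sent as something like:
# response = openai.ChatCompletion.create(
#     model="gpt-3.5-turbo-0613", messages=messages, functions=functions
# )
```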
Ok, so it seems we can’t do anything in the prompt to stimulate CoT reasoning to help arrive at the correct action (i.e. function call), right? IOW, if we are successful in stimulating that reasoning, it means we will get back content (i.e. text) rather than a function call, which defeats the purpose of using the function calling feature in the first place. Or am I not thinking about this correctly?
You get back text no matter what; the difference is in the finish reason: one is “function_call” and the other is “stop” (plus others for error cases).
The best way to think about it is to exclude functions from your mind. What functions do is to use a tuned aspect of the model to create replies that are in JSON format. You can and could achieve the same thing in the old models if you asked for JSON as the reply format, the difference is that it used to be somewhat hit and miss, now it is more reliable in producing text that is JSON.
There has been no fundamental change in the “text in, text out” nature of the model. So I see no reason why CoT or ToT cannot be used, you just need to analyse the return information and treat it as a string.
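To make the “analyse the return information and treat it as a string” point concrete, here is a minimal sketch. The response dict below is a hand-written stand-in for one choice of an API response, and the JSON-encoded arguments are my assumption (for a sandbox-style function, the arguments can also be raw code):

```python
import json

# Hand-written stand-in for one choice of a chat-completion response in
# which the model produced both reasoning text AND a function call.
choice = {
    "finish_reason": "function_call",
    "message": {
        "role": "assistant",
        "content": "First I'll explain the steps, then run the code.",
        "function_call": {
            "name": "python",
            "arguments": json.dumps({"code": "print(2 + 2)"}),
        },
    },
}

message = choice["message"]

# Both properties can be populated at once; neither excludes the other.
reasoning = message.get("content")        # chain-of-thought text, may be None
call = message.get("function_call")       # function call, may be None

if reasoning:
    print("Show to user:", reasoning)
if call:
    args = json.loads(call["arguments"])  # arguments arrive as a JSON string
    print("Dispatch:", call["name"], args["code"])
```

Either field alone, or both together, is just string data to route as you see fit.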
Rather it seems we can. For those needing an example:
You are Answerbot, a large language model trained by OpenAI.
Knowledge cutoff: 2021-09
Current date: 2023-07-16
You will first output a full explanation of how you will attempt to answer the problem:
a natural language introduction and description summarizing the problem or question;
a step by step procedure detailing the steps or algorithm needed to solve the problem and the tools you will use;
Then, in the same response, finally use all tools available to get more information, calculate the answer, or do the operation requested.
Please sum the first 50 digits of PI, and count how many times the digit 3 appears.
"content": "To sum the first 50 digits of PI and count how many times the digit 3 appears, we can follow these steps:\n\n1. Retrieve the first 50 digits of PI.\n2. Convert the digits to a list of individual characters.\n3. Sum the digits.\n4. Count the occurrences of the digit 3.\n\nLet's calculate the sum and count using Python:",
"arguments": "import math\n\n# Step 1: Retrieve the first 50 digits of PI\npi_digits = str(math.pi)[:52] # We retrieve 52 digits to ensure we have at least 50 valid digits\n\n# Step 2: Convert the digits to a list of individual characters\ndigits_list = list(pi_digits)\n\n# Step 3: Sum the digits\ndigits_sum = sum(int(digit) for digit in digits_list)\n\n# Step 4: Count the occurrences of the digit 3\ndigit_3_count = digits_list.count('3')\n\ndigits_sum, digit_3_count"
Response to user, chain-of-thought
To sum the first 50 digits of PI and count how many times the digit 3 appears, we can follow these steps:
Retrieve the first 50 digits of PI.
Convert the digits to a list of individual characters.
Sum the digits.
Count the occurrences of the digit 3.
Let’s calculate the sum and count using Python:
Function call parameter (python sandbox code)
import math

# Step 1: Retrieve the first 50 digits of PI
pi_digits = str(math.pi)[:52] # We retrieve 52 digits to ensure we have at least 50 valid digits
# Step 2: Convert the digits to a list of individual characters
digits_list = list(pi_digits)
# Step 3: Sum the digits
digits_sum = sum(int(digit) for digit in digits_list)
# Step 4: Count the occurrences of the digit 3
digit_3_count = digits_list.count('3')
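One caveat worth flagging in the generated code above: `str(math.pi)` only carries about 17 characters (a float has roughly 15–16 significant digits), so slicing 52 of them cannot actually yield 50 digits, and the sum also includes the decimal point’s neighbors incorrectly. A corrected sketch that hard-codes the first 50 digits of pi:

```python
# First 50 digits of pi (the leading 3 plus 49 decimal places),
# hard-coded because float precision cannot supply this many digits.
PI_50 = "31415926535897932384626433832795028841971693993751"

digits_sum = sum(int(d) for d in PI_50)   # sum of the 50 digits
digit_3_count = PI_50.count("3")          # occurrences of the digit 3

print(digits_sum, digit_3_count)          # -> 247 9
```

This is exactly the kind of error the iteration described below can catch, if the sandbox result is fed back to the model.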
The AI will naturally attempt to iterate over different kinds of function calls based on the return value of the first: different queries, retries after timeouts, different functions for different answers, or rewriting code when an error is shown.
The difference I see is in which property of the response the result is returned. Namely, a JSON function call is returned in the function_call property, whereas a typical “text out” response is returned in the content property.
Now, here is the crucial point that I should have made clearer in my initial question: I had previously thought that the content and function_call properties would not both be returned, and rather that if function_call were populated then content would be set to null.
However, I do see now that the API is happy to return both function_call and content if properly instructed.
Sure, one thing you can do is not use the API client library and instead construct the messages yourself with a JSON library and standard net sockets. This is a more complex way of doing it, but that method will let you send text and get text back in its raw format. If you need just the raw text for some reason, it is possible.
Can you please elaborate on this? I see how you can bypass, for example, the Python OpenAI API library by making a pure HTTP request into the model, but that can’t be what you’re referring to because you still get back the same result (i.e. a JSON formatted text with properties like content, function_call, etc.) How can you get anything more direct than this?
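For what it’s worth, even at the raw-socket level the response body is the same JSON document; “more direct” only means nothing has pre-parsed it into objects for you. A sketch of what raw text in, raw text out means in practice (the body below is a hand-written stand-in for what an HTTPS POST to the chat completions endpoint returns):

```python
import json

# Hand-written stand-in for the raw HTTP response body: it is nothing
# more than text until you choose to parse it.
raw_body = """{
  "choices": [{
    "finish_reason": "function_call",
    "message": {
      "content": "Here is my reasoning...",
      "function_call": {"name": "python", "arguments": "print(1)"}
    }
  }]
}"""

# Doing the parsing yourself is the only extra "directness" you gain.
parsed = json.loads(raw_body)
message = parsed["choices"][0]["message"]
print(message["content"])
print(message["function_call"]["name"])
```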