Can prompt design enhance model's planning/reasoning when using function calling?

Function calling seems to plan and reason pretty well compared to using the ReAct pattern. However, with ReAct, you could always try to address failed reasoning by tuning the prompt, whereas with function calling it seems like you get what you get. Is this correct? In other words, if the model fails to call functions when and as desired, is there anything you can do in your prompting to overcome this? (I understand that the name and description you provide for the functions are something you can tune, but I am wondering if it's worth trying to tune the actual prompt messages.)

EDIT: a better formulation of the question is whether it's possible for the API to return both the content property (i.e. with reasoning steps) and the function_call property simultaneously. As shown below by _i, this is in fact possible.


Totally, the prompt and the function names and descriptions all work together to let the model know your wishes. It's easy to think the functions are some "other thing", but they are still text that the model is reading and trying to come up with a good reply to. The model has been fine-tuned to return function calls when it thinks it's appropriate, but it's still totally up to your prompts to let it know what it should do and when.
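For illustration, here is a minimal sketch of that idea, assuming the 2023-era openai Python library (pre-1.0) with its legacy functions parameter; the get_weather function, the model name, and the prompt text are placeholders, not anything from this thread:

import openai  # the legacy library reads the API key from the OPENAI_API_KEY environment variable

# The function definition is just more text the model reads, right alongside the prompt.
functions = [
    {
        "name": "get_weather",  # placeholder function name
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. Paris"},
            },
            "required": ["city"],
        },
    }
]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",  # placeholder model
    messages=[
        {"role": "system", "content": "Call get_weather whenever the user asks about weather; otherwise answer directly."},
        {"role": "user", "content": "Do I need an umbrella in Paris today?"},
    ],
    functions=functions,
)

The system message and the function description are sent in the same request, so the model weighs them together when deciding whether to reply with text or with a function call.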

Hmm, I actually didn’t ask the question quite right. The issue I see is that with ReAct the prompt coaxes the model to think in a particular way by incorporating the thought process in the completion. From what I can tell, however, this is not possible with function calling because the completion seems either to be content or a function call, but not both. Is that not the case?


Correct: you are returned either the function-call finish reason, with the function details as the payload, or the standard assistant response, not both.

I have gotten answers that are a combination of both, so it is supported.

Output (with a python function capability): a response with both a user-readable answer and an invocation of a function looks like

content: AI explanation of how code could be written to solve the problem, then AI “let’s see that in action”
function: python code to calculate and return answer

or

content: explanation of how AI would ask for a drawing of your picture
function: image generation API

You can give a prompt that is specifically designed to evoke this response, where the first output with thought processes can improve the quality of the function that is written:

  • first, explain the steps that are required to obtain the answer, and the tools that you can use to enhance your answer quality; then
  • to the best of your AI abilities, try to answer the question without the assistance of functions; then
  • after your explanation, actually do the task or calculation that was described.

Whether this type of output is included is as unpredictable as the function-calling itself, and whether to place these instructions in the system or user role may depend on your own experimentation, the quality of the answers, and the user inputs you are taking, but the step-by-step instructions should give a much higher chance of user-readable output.

OK, so it seems we can't do anything in the prompt to stimulate CoT reasoning to help arrive at the correct action (i.e. function call), right? IOW, if we are successful in stimulating that reasoning, it means we will get back content (i.e. text) rather than a function call, which defeats the purpose of using the function calling feature in the first place. Or am I not thinking about this correctly?

You get back text no matter what; the difference is in the finish reason, one is "function_call" and the other is "stop" (plus others for error cases).

The best way to think about it is to exclude functions from your mind. What functions do is use a tuned aspect of the model to create replies that are in JSON format. You could achieve the same thing in the old models if you asked for JSON as the reply format; the difference is that it used to be somewhat hit and miss, whereas now it is more reliable at producing text that is JSON.

There has been no fundamental change in the “text in, text out” nature of the model. So I see no reason why CoT or ToT cannot be used, you just need to analyse the return information and treat it as a string.

Rather it seems we can. For those needing an example:

API Input

system

You are Answerbot, a large language model trained by OpenAI.
Knowledge cutoff: 2021-09
Current date: 2023-07-16

  1. You will first output a full explanation of how you will attempt to answer the problem:
  • a natural language introduction and description summarizing the problem or question;
  • a step by step procedure detailing the steps or algorithm needed to solve the problem and the tools you will use;
  2. Then, in the same response, finally use all tools available to get more information, calculate the answer, or do the operation requested.

user

Please sum the first 50 digits of PI, and count how many times the digit 3 appears.

API output

{
  "index": 0,
  "message": {
    "role": "assistant",
    "content": "To sum the first 50 digits of PI and count how many times the digit 3 appears, we can follow these steps:\n\n1. Retrieve the first 50 digits of PI.\n2. Convert the digits to a list of individual characters.\n3. Sum the digits.\n4. Count the occurrences of the digit 3.\n\nLet's calculate the sum and count using Python:",
    "function_call": {
      "name": "python",
      "arguments": "import math\n\n# Step 1: Retrieve the first 50 digits of PI\npi_digits = str(math.pi)[:52]  # We retrieve 52 digits to ensure we have at least 50 valid digits\n\n# Step 2: Convert the digits to a list of individual characters\ndigits_list = list(pi_digits)\n\n# Step 3: Sum the digits\ndigits_sum = sum(int(digit) for digit in digits_list)\n\n# Step 4: Count the occurrences of the digit 3\ndigit_3_count = digits_list.count('3')\n\ndigits_sum, digit_3_count"
    }
  },
  "finish_reason": "function_call"
}

Thus giving:

Response to user, chain-of-thought

To sum the first 50 digits of PI and count how many times the digit 3 appears, we can follow these steps:

  1. Retrieve the first 50 digits of PI.
  2. Convert the digits to a list of individual characters.
  3. Sum the digits.
  4. Count the occurrences of the digit 3.

Let’s calculate the sum and count using Python:

Function call parameter (python sandbox code)

import math

# Step 1: Retrieve the first 50 digits of PI
pi_digits = str(math.pi)[:52]  # We retrieve 52 digits to ensure we have at least 50 valid digits

# Step 2: Convert the digits to a list of individual characters
digits_list = list(pi_digits)

# Step 3: Sum the digits
digits_sum = sum(int(digit) for digit in digits_list)

# Step 4: Count the occurrences of the digit 3
digit_3_count = digits_list.count('3')

digits_sum, digit_3_count

The AI will naturally attempt to iterate over different types of function calls based on the return value of the first: different queries, retries for timeouts, different functions for different answers, or rewriting code when an error is returned.
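A rough sketch of that loop, assuming the 2023-era openai library; SYSTEM_PROMPT, PYTHON_FUNCTION_SPEC, and run_python are placeholders for your own step-by-step prompt, your declared python function, and your sandbox executor:

import openai

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},  # the step-by-step prompt shown above
    {"role": "user", "content": "Please sum the first 50 digits of PI, and count how many times the digit 3 appears."},
]

while True:
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",        # placeholder model
        messages=messages,
        functions=[PYTHON_FUNCTION_SPEC],  # your declared python sandbox function
    )
    choice = response["choices"][0]
    messages.append(choice["message"])

    if choice["finish_reason"] != "function_call":
        break  # a plain assistant answer means the model is done iterating

    # Run the code the model wrote and feed the result back as a "function" role message
    result = run_python(choice["message"]["function_call"]["arguments"])
    messages.append({"role": "function", "name": "python", "content": str(result)})

print(choice["message"]["content"])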


The difference I see is in which property of the response the result is returned. Namely, a JSON function call is returned in the function_call property, whereas a typical "text out" response is returned in the content property.

Now, here is the crucial point that I should have made clearer in my initial question: I had previously thought that the content and function_call properties would not both be returned, and rather that if function_call were populated, content would be set to null.

However, I do see now that the API is happy to return both function_call and content if properly instructed.
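For anyone following along, here is a minimal sketch of reading both properties from one choice, assuming response is the decoded JSON body of a chat completions reply:

message = response["choices"][0]["message"]

if message.get("content"):
    print("reasoning / explanation:", message["content"])  # the chain-of-thought text

if message.get("function_call"):
    call = message["function_call"]
    print("function to call:", call["name"])
    print("arguments:", call["arguments"])  # a string you parse or execute yourself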

Sure, one thing you can do is not use the API library and instead construct the messages yourself with a JSON library and standard net sockets. This is a more complex way of doing it, but that method will let you send text and get text back in its raw format. If you need just the raw text for some reason, it is possible.
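A minimal sketch of that approach, using only the Python standard library; the model name is a placeholder and OPENAI_API_KEY is assumed to be set in the environment:

import http.client
import json
import os

body = json.dumps({
    "model": "gpt-3.5-turbo",  # placeholder model
    "messages": [{"role": "user", "content": "Hello"}],
})

conn = http.client.HTTPSConnection("api.openai.com")
conn.request(
    "POST",
    "/v1/chat/completions",
    body=body,
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer " + os.environ["OPENAI_API_KEY"],
    },
)
raw = conn.getresponse().read().decode("utf-8")  # the reply exactly as the server sent it
print(raw)  # raw JSON text, before any client library turns it into objects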

Can you please elaborate on this? I see how you can bypass, for example, the Python OpenAI API library by making a pure HTTP request to the model, but that can't be what you're referring to, because you still get back the same result (i.e. JSON-formatted text with properties like content, function_call, etc.). How can you get anything more direct than this?

Sure, if you look in my bio I have an ESP32 library; if you check out the C++ code for that, it implements a basic chat completions endpoint call and response handling, direct link here

It should give you an idea of what I mean.

To complete the spectrum of responses, I’d say that I do both in one function_output.

Let me show you.

from typing import Literal  # MM, WW, NN, NONE and p1s are string constants defined elsewhere
from pydantic import BaseModel, Field

class Output(BaseModel):
    s: Literal[MM, WW, NN, NONE] = Field(description=f"{p1s}")  # This is what I need
    rs: str = Field(description=f"Brief reasoning behind your choice of {p1s}")

I normally do:

  1. Test a prompt with text_output.
  2. When refined, move it to function_output with reasoning (Output above is the function format)
  3. When refined, remove the reasoning output (just put a # before the rs)
  4. [Update 2024-09-03]: I would not remove the reasoning, as it does help to get better answers (and if the explanation is logical but away from what you wanted, it helps you modify something else in the prompt). I would also put the reasoning before the solution.

Sometimes you have "a prompt" and the text_responses differ from the function_responses. That may be for many reasons. Something to shed light on the problem is the reasoning within the function output. It is like a probe that lets you see what is going on.
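As a sketch of what I mean by a probe, assuming response is the decoded JSON reply from a call that used the Output model above as its function specification:

import json

args = json.loads(response["choices"][0]["message"]["function_call"]["arguments"])
probe = Output(**args)

print("choice:", probe.s)
print("reasoning probe:", probe.rs)  # read this when the choice looks wrong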

I arrived at this thread by searching:
calling ai | mml model with function works better if you allow it to express reasoning

If yes, leaving "rs" there even if I won't use it may turn out to be beneficial, even if it has a cost.

[Update 2024-09-03]: Normally I do this.

  1. prompt > LLM > text_output
  2. text_output + function_call > LLM > object

The prompt contains at the end a light version of the desired format. For the case above I would use:

# Output format:
reasoning: str  # Brief reasoning behind your choice of solution
solution: str  # One of ["MM", "WW", "NN", "None"]  <briefly explain what it is>

and then use this for the function_specification

from typing import Literal
from pydantic import BaseModel

class Output(BaseModel):
    reasoning: str
    solution: Literal[MM, WW, NN, NONE]

It works really well as it is very similar to Pydantic format.
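A sketch of those two steps, assuming Pydantic v2 (Output.model_json_schema(); use Output.schema() on v1), the legacy functions parameter, and placeholder names such as extract_answer and PROMPT_WITH_LIGHT_FORMAT:

import json
import openai

# Step 1: prompt > LLM > text_output
text_output = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",  # placeholder model
    messages=[{"role": "user", "content": PROMPT_WITH_LIGHT_FORMAT}],  # your prompt ending with the light "# Output format:" block
)["choices"][0]["message"]["content"]

# Step 2: text_output + function_call > LLM > object
spec = {
    "name": "extract_answer",  # placeholder function name
    "description": "Record the reasoning and the chosen solution.",
    "parameters": Output.model_json_schema(),
}
call = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=[{"role": "user", "content": text_output}],
    functions=[spec],
    function_call={"name": "extract_answer"},  # force the structured extraction
)["choices"][0]["message"]["function_call"]

obj = Output(**json.loads(call["arguments"]))  # reasoning + solution as a validated object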


Perplexity gave me this

[…]

Additionally, prompting the model to provide explanations and reasoning for its actions can help improve the reliability and transparency of the function call process. This allows the developer to better understand the model's decision-making process and make adjustments as needed.

In summary, when using OpenAI's function call feature, it is important to:

  1. Allow the model to express its reasoning and provide explanations for its actions.
  2. Implement robust validation and error handling mechanisms to ensure the returned data is accurate and complete.
  3. Prompt the model to provide explanations and reasoning for its actions to improve transparency and reliability.

By following these best practices, developers can leverage the power of OpenAI’s function call feature to build more sophisticated and reliable AI-powered applications.