Getting a function call + textual response in the same call

High-level problem: getting a text message in “Content” and the result of function calling in the same API response doesn’t seem possible in a single call (example at the bottom of this post).

Use Case:
I am using OpenAI to navigate a website using tools like click/scroll/type.

Present State:
My user prompt contains the following (a rough sketch follows the list):

  1. The annotated element dictionary for the web page (each element is numbered); I also provide the annotated image, since I use gpt-4o
  2. The schemas of my tools
  3. The task I am trying to achieve (like go into the Help Section of the Website)
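For illustration, the assembled user message looks roughly like this. The element dictionary, tool schemas, and task are made-up placeholders, and the image part uses the standard chat-completions vision payload:

import json

# hypothetical inputs for one iteration
annotated_elements = {"1": "link: Home", "2": "button: Search", "3": "link: Help"}
tool_schemas = [
    {"name": "click", "args": {"element_id": "int"}},
    {"name": "type", "args": {"element_id": "int", "text": "str"}},
]
task = "Go into the Help Section of the Website"

messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": (
            f"Elements: {json.dumps(annotated_elements)}\n"
            f"Tools: {json.dumps(tool_schemas)}\n"
            f"Task: {task}"
        )},
        # the annotated screenshot, since gpt-4o is multimodal
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
    ],
}]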

The functions param is left empty, so I am not really using function calling in its true sense.

I am asking the LLM to give me two things:

  1. Thought (For example - I need to find the help section on this web page)
  2. The function to be used (click/type) and its args
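Concretely, I ask the model to answer in a shape like this (the JSON format is my own convention, not an API feature):

{
    "thought": "I need to find the help section on this web page",
    "action": {"name": "click", "args": {"element_id": 3}}
}

I json.loads this, execute the action, and carry the thought into the next iteration’s user prompt.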

Why is the thought important for me?
I need to capture this thought in my chat history. It goes into the user prompt for the next iteration, so that the LLM can figure out the next steps after my functions execute the steps from the previous iteration. The LLM needs to be told what it thought in the previous instance, so that it does not repeat the same step/mistake again.

In this present state it fails sometimes, which pushes me in the true function calling direction…

I know how/where to attach the tool/function schemas for function calling…

So I can get Part 2 of my required response (the function and its args).
Getting the “Thought” itself seems difficult/impossible, though.

Now that you understand why “Thought” is important: I am asking how to get “Thought” in the Content part of my response and the function + args in the function_call part of my response, in the same LLM call. I can’t afford two calls due to the scale of operations.

I have already tried a very simple dummy example like the one below, and the concurrent content and function call I am asking for is not happening:

  1. Create a get_current_weather function
  2. Have the prompt as: “What’s the weather like in Boston? Also tell a joke”

I am only getting the function call; I was also expecting a joke.

My dummy code, which doesn’t achieve what I’m asking for:

import os
import json

import openai
from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv())  # read the local .env file
openai.api_key = os.environ['OPENAI_API_KEY']

def get_current_weather(location, unit="fahrenheit"):
    """Get the current weather in a given location"""
    weather_info = {
        "location": location,
        "temperature": "72",
        "unit": unit,
        "forecast": ["sunny", "windy"],
    }
    return json.dumps(weather_info)

# schema for the function, passed via the functions param
functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    }
]


messages = [
    {
        "role": "user",
        "content": "What's the weather like in Boston? also tell me a joke"
    }
]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=messages,
    functions=functions
)

response_message = response["choices"][0]["message"]
print(response_message["content"]) #there will be no content at all, but i expected a joke
print(response_message["function_call"])  #correct function call

I don’t believe it is possible to handle both in one call.

You must keep a visible chat history and an inner thought history (for the current response) that you send to the LLM together, while only showing the visible chat to the user.

You loop the inner discussion with the bot, accumulating function results, until all function calls and their answers have been processed by the LLM and you have a response you can send to the user.
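For what it’s worth, that loop can look roughly like this with the same pre-1.0 openai SDK and the weather example from the post above (the helper name and the stop-on-plain-text condition are my own):

def run_until_text(messages, functions, available):
    """Call the model, execute any function call it makes, feed the
    result back, and stop once it returns a plain-text message."""
    while True:
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo-0613",
            messages=messages,
            functions=functions,
        )
        message = response["choices"][0]["message"]
        messages.append(message)  # the inner (hidden) history grows here
        if not message.get("function_call"):
            return message["content"]  # final user-facing answer
        call = message["function_call"]
        result = available[call["name"]](**json.loads(call["arguments"]))
        messages.append({"role": "function", "name": call["name"], "content": result})

answer = run_until_text(messages, functions, {"get_current_weather": get_current_weather})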

You can look at my algorithm here as an example:

  1. Prompt the model to provide both the thought and the function call separately.
  2. Parse the response to extract the thought and the function call.
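A rough sketch of step 2, assuming the prompt instructs the model to emit a "Thought:" line followed by a JSON function call (both conventions are my own assumptions, not API features):

import json
import re

def parse_reply(content):
    """Split an assumed 'Thought: ...' line from an assumed JSON
    function-call object in a plain-text model reply."""
    thought_match = re.search(r"Thought:\s*(.*)", content)
    json_match = re.search(r"\{.*\}", content, re.DOTALL)
    thought = thought_match.group(1).strip() if thought_match else None
    call = json.loads(json_match.group(0)) if json_match else None
    return thought, call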

It’s by design IMO. The idea is that the LLM assumes it needs the results of one or more tools in order to generate its response.

There is a workaround: in the system prompt, instruct the LLM to explain what tools it will use and why before actually calling them.

It will then do exactly that and wait for user confirmation before issuing the tool calls.

You might even instruct it to call them right away by adding a custom reserved word to the end of such messages; parse for it and automatically reply with a confirmation message, so it requires no user intervention.
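A minimal sketch of that auto-confirmation, assuming the reserved word is [EXECUTE] (both the word and the prompt wording are assumptions):

SYSTEM_PROMPT = (
    "Before calling any tool, explain which tools you will use and why. "
    "End such explanation messages with the word [EXECUTE]."
)

def maybe_autoconfirm(messages, assistant_text):
    """If the reply ends with the reserved word, confirm on the
    user's behalf so the model proceeds straight to the tool calls."""
    if assistant_text and assistant_text.rstrip().endswith("[EXECUTE]"):
        messages.append({"role": "user", "content": "Confirmed, go ahead."})
        return True  # caller should re-invoke the model
    return False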

You can also check Microsoft’s AutoGen framework. It can be used for multi-agent conversation, including mixing classic text-response agents with tool-only agents in the same conversation.
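An untested sketch of what that can look like (pyautogen 0.2-era API; the model and config are placeholders):

import os
from autogen import AssistantAgent, UserProxyAgent

config_list = [{"model": "gpt-4", "api_key": os.environ["OPENAI_API_KEY"]}]

# a text-response agent plus a proxy that runs without human input
assistant = AssistantAgent("assistant", llm_config={"config_list": config_list})
user_proxy = UserProxyAgent("user_proxy", human_input_mode="NEVER",
                            code_execution_config=False)

user_proxy.initiate_chat(assistant, message="What's the weather like in Boston?")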

FYI, I don’t know how Custom GPTs do it, but they can absolutely mix function calls and actual answers. They might use the trick above, or a multi-agent implementation.

Beware of consequential function calls, though. You might end up with the LLM unleashed on your tools and running amok. It happened to me more than once using a Custom GPT: the LLM kept talking to itself for like 5 minutes straight, doing dozens of function calls without any user interaction.