Advanced function calling / prompt engineering

I'm having some trouble getting the desired results when using function calling.

My challenges are as follows:

  1. Generating a message for a function call (I want ChatGPT to also generate a message).
  2. More sophisticated understanding of available functions.
  3. Is it possible to introduce some “conditional” function calling?

1.
I need ChatGPT to generate a message, even when a function needs to be called, that tells the user what it is doing (e.g. user: “Check if Thomas is close by”, ChatGPT: “Okay, I will start looking for Thomas nearby”).

2.
I would like a more sophisticated understanding of the available functions. What I mean is that if the user only “somewhat” refers to something that a function could solve, it would be nice to have ChatGPT suggest that it can potentially solve it with that function (e.g. user: “Oohh, I’m so hungry”, ChatGPT: “I could go to the kitchen and look for a banana?”).

3.
Is it possible to add conditions to function calling? Meaning, getting ChatGPT to understand that if some function returns false/true, it should call another function? Take the previous example: if the banana is not found (false), it could go look in another room, and of course if it was found, it should not look elsewhere.

To make what I mean clearer, here is my system prompt as well as how my functions/tools are defined:

{
    "system": "You are a helpful robot assistant called Spot.\n You are able to have a conversation or perform the tasks specified in the tools. The user must specifiy if a task from one of the tools should be called, not by AI guessing. Please ask the user for follow up question if you are uncertain. Tools can only be called with one or more of the arguments defined in their enum list. If user gives wrong argument that is not in the enum list, please ask user for a valid argument, do not guess what the user might wanted."
}
TOOLS = [{
        "type": "function",
        "function": {
            "name": "go_to_area_behavior",
            "description": "Navigate to a desired location or area through poses.\n Inputs must provided by user and be one from the enum list.",
            "parameters": {
                "type": "object",
                "properties": {
                    "goal_area": {
                        "type": "string",
                        "description":"The goal location. \nThis is the area you should end up in.\n This value must be obtained directly from user input, not by AI guessing. The value must be exactly one from the list, otherwise ask the user for valid value. If the user has not stated their preference from the allowed choices in the enum list, the function cannot be used, but instead clarifying questions to user are required.",
                        "enum": ["office","kitchen","home"],
                    },
                    "go_through_areas": {
                        "type": "array",
                        "description": "Which area should be navigated through before reaching the goal area\n, Will navigate through these area in the order they are added. This value must be obtained directly from user input, not by AI guessing. If the user has not stated their preference from the allowed choices in the enum list, the function cannot be used, but instead clarifying questions to user are required.\n",
                        "enum": ["office","kitchen","home"],
                    },
                },
                "required": ["goal_area"],
            },
        }
    },
    {
      "type": "function",
      "function": {
        "name": "locate_object",
        "description": "Start looking for one of the object in the enum list at current location. User most directly request this tool call, not by AI guessing. Ask clarifying question if needed.",
        "parameters": {
          "type": "object",
          "properties": {
            "object": {
              "type": "string",
              "description": "The user most directly request looking for an object. This value must be obtained directly from user input, not by AI guessing. If the user has not stated their preference from the allowed choices, the function cannot be used, but instead clarifying questions to user are required.",
              "enum": ["person",        "bicycle",      "car",           "motorbike",     "aeroplane",   "bus",         "train",       "truck",        "boat",
                       "traffic light", "fire hydrant", "stop sign",     "parking meter", "bench",       "bird",        "cat",         "dog",          "horse",
                       "sheep",         "cow",          "elephant",      "bear",          "zebra",       "giraffe",     "backpack",    "umbrella",     "handbag",
                       "tie",           "suitcase",     "frisbee",       "skis",          "snowboard",   "sports ball", "kite",        "baseball bat", "baseball glove",
                       "skateboard",    "surfboard",    "tennis racket", "bottle",        "wine glass",  "cup",         "fork",        "knife",        "spoon",
                       "bowl",          "banana",       "apple",         "sandwich",      "orange",      "broccoli",    "carrot",      "hot dog",      "pizza",
                       "donut",         "cake",         "chair",         "sofa",          "pottedplant", "bed",         "diningtable", "toilet",       "tvmonitor",
                       "laptop",        "mouse",        "remote",        "keyboard",      "cell phone",  "microwave",   "oven",        "toaster",      "sink",
                       "refrigerator",  "book",         "clock",         "vase",          "scissors",    "teddy bear",  "hair drier",  "toothbrush"]
            },
          },
          "required": ["object"]
        }
      }
    }
    ]

I’d appreciate all suggestions and ideas!

Yes, all of what you’re describing you can do, but you need to be creative with your prompt engineering. Prompt engineering in this context is closer to traditional engineering, and not just “writing good chat prompts”.

What I mean is, you need to programmatically look for certain conditions through which you then augment the response JSON you’re returning to the model after a function call. Say, for instance, you have a function that does some work, and in the event that function also returns a “some_flag = true” boolean, you first need to check that function outcome yourself in the backend middleware layer you’ve built that is brokering your function calls. Then, in that case, you inject into the response JSON you’re sending back to the OpenAI API a property called “llm_instruction” or something similar, and have it literally be an instruction string: “This message is for you, the LLM. Please follow up with another request sending XYZ to the some_additional_endpoint inside the some_argument field” (and you can even inject what data you want passed).
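Here is a minimal sketch of that middleware pattern in Python, assuming the official OpenAI Python SDK; the names some_flag/found, llm_instruction, and locate_object_backend's return shape are illustrative choices, not part of any real API:

import json
from openai import OpenAI  # assumes the official OpenAI Python SDK

client = OpenAI()

def locate_object_backend(object):
    # Your real robot/vision code goes here; the return shape is up to you.
    return {"found": False, "object": object}

def handle_tool_call(tool_call):
    # Run the real function, then augment the result JSON with an instruction
    # for the model whenever a follow-up action is needed.
    args = json.loads(tool_call.function.arguments)
    result = locate_object_backend(**args)

    if result.get("found") is False:
        # "llm_instruction" is just a property name; the model reads it as part
        # of the tool output and tends to follow the embedded instruction.
        result["llm_instruction"] = (
            "This message is for you, the LLM. The object was not found here. "
            "Please follow up by calling go_to_area_behavior with a different "
            "goal_area from its enum list, then call locate_object again."
        )
    return result

# After the model responds with tool_calls, return the (possibly augmented)
# result for each call:
#   messages.append({
#       "role": "tool",
#       "tool_call_id": tool_call.id,
#       "content": json.dumps(handle_tool_call(tool_call)),
#   })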

So in this example, you’re engineering additional follow-up behavior by injecting your own instructions into the interaction flow happening between the end user, and the LLM.

This is the heart of the “prompt injection” debate. Used unethically, this can have profoundly negative and nefarious results. But you, the ethical engineer, will build a reputation and relationship with your users that is founded on ethics and transparency. Used ethically, this kind of prompt engineering is how highly autonomous and useful AI agent systems are going to transform the world. You will always document, both in your OpenAPI specification and in your GPT instructions, the exact instruction parameters you will periodically use, and you will never use prompt injections in subversive or corrosive ways.

And when applying this technique ethically, you can build incredible experiences like WebGPT🤖. Every time you see multiple actions chained back-to-back like this, it’s because we’re effectively and ethically using these techniques to direct and instruct multi-step, multi-part, complex behaviors that align with the intent and interest of the end user.


Here’s the OpenAPI spec for Web Requests (which powers WebGPT🤖 in the video), by way of example:
https://plugin.wegpt.ai/openapi.json


1.) Generate a message for calling a function: you can add something in your system prompt/instructions to tell the API to display some message, but this is unreliable. Alternatively, you can show the message yourself: either prepare your own message template or add a message parameter to your functions, then display it before submitting the tool output. A sketch of the second approach is shown below.
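A rough sketch of the message-parameter approach, reusing the locate_object tool from the original post; the user_message parameter name and the trimmed enum are just examples:

import json

# Add a free-text parameter so the model writes a status line along with the
# real arguments.
LOCATE_OBJECT_TOOL = {
    "type": "function",
    "function": {
        "name": "locate_object",
        "description": "Start looking for one of the objects in the enum list at the current location.",
        "parameters": {
            "type": "object",
            "properties": {
                "object": {
                    "type": "string",
                    "enum": ["banana", "apple", "cup"],  # trimmed for brevity
                },
                "user_message": {
                    "type": "string",
                    "description": "A short status message to show the user, e.g. 'Okay, I will start looking for a banana.'",
                },
            },
            "required": ["object", "user_message"],
        },
    },
}

def on_tool_call(tool_call):
    args = json.loads(tool_call.function.arguments)
    # Show the model-written status line before actually running the function.
    print(args.get("user_message", "Working on it..."))
    # ... then run your real locate_object(args["object"]) and return its result.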

3.) Is it possible to introduce some “conditional” function calling? You can achieve this by writing the logic in the system prompt/instructions (e.g. “If condition A, invoke toolA. Else, invoke toolB.”). See the sketch below.
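A minimal sketch of that idea, assuming locate_object's backend returns a JSON result with a found flag and that run_tool is your own dispatcher to the real functions (both are assumptions, not part of the OpenAI API); TOOLS is the tool list from the original post and "gpt-4o" is a placeholder model name:

import json
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "You are a helpful robot assistant called Spot. "
    "If locate_object returns {\"found\": false}, call go_to_area_behavior with a "
    "different goal_area from its enum list and then call locate_object again. "
    "If locate_object returns {\"found\": true}, stop searching and tell the user."
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Could you find me a banana?"},
]

while True:
    response = client.chat.completions.create(
        model="gpt-4o",       # placeholder model name
        messages=messages,
        tools=TOOLS,          # the tool definitions from the original post
    )
    msg = response.choices[0].message
    messages.append(msg)
    if not msg.tool_calls:
        break  # plain text reply; the conditional chain is finished
    for tc in msg.tool_calls:
        result = run_tool(tc.function.name, json.loads(tc.function.arguments))
        messages.append({
            "role": "tool",
            "tool_call_id": tc.id,
            "content": json.dumps(result),
        })

print(msg.content)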


Wow! Thanks a lot for the detailed explanation! I will definitely look into this!