On function calling, can I pass an object to a function?

class Car:
    def __init__(self, name, model):
        self.name = name
        self.model = model

    def makesound(self):
        return f"Car {self.name} {self.model} makes sound"

    def __str__(self):
        return f"{self.name} {self.model}"

    def to_dict(self):
        return {"name": self.name, "model": self.model}


def make_car(name, model):
    car = Car(name, model)
    return car

def make_sound(car: Car):
    # str(car) uses Car.__str__; a Car instance can't be concatenated to a str with +
    return f"{car} makes sound pwweeee"

main.py

import json

# Assumes client (an OpenAI-compatible chat client) and MODEL (the model name) are defined elsewhere.
def run_conversation(user_prompt):
    messages = [
        {
            "role": "system",
            "content": """
                You are a function-calling LLM that can call the make_car function to create a car with a given name and model; make_car returns a Car object.
            """,
        },
        {
            "role": "user",
            "content": user_prompt,
        },
    ]
    tools = [
        {
            "type": "function",
            "function": {
                "name": "make_car",
                "description": "Make a car object with the given name and model",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "name": {
                            "type": "string",
                            "description": "The name of the car (eg: 'Volkswagen')",
                        },
                        "model": {
                            "type": "string",
                            "description": "The model of the car (eg: 'Vento', 'Polo', etc.)",
                        },
                    },
                    "required": ["name", "model"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "make_sound",
                "description": "Make the car sound",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "car": {
                            "type": "object",
                            "description": "The car object to make sound (eg: car object will be returned by make_car function)",
                        },
                    },
                    "required": ["car"],
                },
            },
        },
    ]

    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        tools=tools,
        tool_choice="auto",
        max_tokens=4096,
    )

    response_message = response.choices[0].message
    tool_calls = response_message.tool_calls
    if tool_calls:
        available_functions = {
            "make_car": make_car,
            "make_sound": make_sound,
        }
        messages.append(response_message)
        for tool_call in tool_calls:
            function_name = tool_call.function.name
            function_to_call = available_functions[function_name]
            function_args = json.loads(tool_call.function.arguments)
            if function_name == "make_car":
                function_response = function_to_call(
                    name=function_args.get("name"),
                    model=function_args.get("model"),
                )
            elif function_name == "make_sound":
                function_response = function_to_call(
                    car=function_args.get("car"),
                )
            messages.append(
                {
                    "tool_call_id": tool_call.id,
                    "role": "tool",
                    "name": function_name,
                    "content": function_response
                }
            )
        second_response = client.chat.completions.create(
            model=MODEL, messages=messages
        )
        return second_response.choices[0].message


user_prompt = "Make a car with name 'Volkswagen' and model any model also make sound"
print(run_conversation(user_prompt))

Is there any way to pass a Car object to the make_sound function?

The purpose of a function should be obvious to the AI. Its description should tell the AI what the function is for, what action it will take, and what the result will be. The AI will invoke the tool when doing so would satisfy or improve its answer to the user's input.

“make an object” or “make a car” doesn’t really make sense. Is this AI running a car factory?

You don't show how you imagine a "function_to_call()" function would work.

It is unclear if this is a coding question or a question about interacting with the AI.

Passing an object means passing a pointer to an instance of an object.
The LLM works with text strings, not with memory addresses.
It may be an idea to provide the model with a string describing the car object and let the model add this string to the function call, in the sense of 'take the contents of car.txt and use it as a parameter when invoking the function'.
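
For instance, since the Car class in the question already has a to_dict() method, a minimal sketch of that idea (building only on the code shown above, not on any real API) is to have make_car return a JSON string and have make_sound rebuild the car from the dict the model passes back:

import json

def make_car(name, model):
    car = Car(name, model)
    # The tool output the model sees is plain text, e.g. '{"name": "Volkswagen", "model": "Polo"}'
    return json.dumps(car.to_dict())

def make_sound(car: dict):
    # car arrives as parsed JSON arguments, not as the original in-memory object
    rebuilt = Car(car["name"], car["model"])
    return f"{rebuilt} makes sound pwweeee"

The make_sound schema's car parameter can then declare name and model as string properties, and the model simply copies back the dict it saw in the earlier tool result.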

Oh… so you are saying that there is no way to do that? :melting_face:

I'm curious about a particular aspect of the kolbytn/mindcraft repository on GitHub. It involves automating Minecraft with AI agents. Within this repository, how does the developer pass an AI bot object to the AI in order to call functions on the bot? I'd appreciate any insights or explanations regarding this process.

I think we need to leave the phrase “object” behind here.

Objects?

In OO programming, an object is a single entity instance that can have its own properties and methods. For example, my chatbot has message objects which are instances of a message class, where it has the primitives that store values such as the text or number of tokens, objects such as the delete button and its icon, and methods such as the “change role” or “move” which operate on itself or its container.

AI is not objects

In interacting with AI models, the AI simply processes natural language. You send it “hello” and it sends “Hi there!” back.

You send it a message “text enclosed in %%% will generate an AI image”, and it will communicate differently when it wants to make an image for you.

Functions

OpenAI added functions as a feature and trained the AI on them. They are made of two types of structured data interfaces on the API:

  • a JSON schema specification to tell the AI what tool you have for it (which is rewritten into more natural language), and

  • a JSON output (that the AI writes) to use the tool.

The AI is then trained so that, if you offer it a tool function "search_the_database" with a parameter "search_terms" (along with a further description of what that database contains and why it is useful), instead of responding to the user it will write a special container that gives you a tool_call. You must then fulfill the tool's purpose in your chatbot code and return what you promised the function would do.
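
To make that concrete, here is a minimal sketch of both interfaces for that hypothetical search_the_database tool (the name, parameter, and description are only the illustration from the paragraph above, not a real API):

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_the_database",
            "description": "Searches the product database; useful when the user asks about products we sell.",
            "parameters": {
                "type": "object",
                "properties": {
                    "search_terms": {
                        "type": "string",
                        "description": "Keywords to search the database for",
                    },
                },
                "required": ["search_terms"],
            },
        },
    },
]

When the model decides to use it, the assistant message carries a tool_call instead of text, for example with function.name set to "search_the_database" and function.arguments set to '{"search_terms": "wireless headphones"}'. Your code runs the search and returns the result as the tool message content.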


Your car example and its horn-honking function give no purpose or reason. Your system message doesn't tell the AI what it specializes in, either.

Let’s give an example of something that interacts with the world, and I will describe it to you just like the AI needs it described.

robot_actions function:

Description: ChatBot has a robot head on an android to show its emotions. Before you write a reply to the user, you can express an emotion on the robot face the user can see, which will provide a more immersive interaction with ChatBot. The robot face will return to neutral expression after the reply to the user has finally been spoken.

Properties:

mouth_expression: string enum, from [smile | frown | neutral | grin | pout | gape | smirk | snarl]
eye_expression: string enum, from [wide | narrow | normal | squint | wink | roll | blink | gleam]

That’s just metacode. I use AI to give a schema for code:


functions = [
    {
        "name": "robot_actions",
        "description": "Allows the ChatBot to show emotions on its robot face for immersive interaction.",
        "parameters": {
            "type": "object",
            "properties": {
                "mouth_expression": {
                    "type": "string",
                    "enum": ["smile", "frown", "neutral", "grin", "pout", "gape", "smirk", "snarl"],
                    "description": "Controls the mouth expression of the robot to show different emotions."
                },
                "eye_expression": {
                    "type": "string",
                    "enum": ["wide", "narrow", "normal", "squint", "wink", "roll", "blink", "gleam"],
                    "description": "Controls the eye expression of the robot to show different emotions."
                }
            },
            "required": ["mouth_expression", "eye_expression"]
        }
    }
]

Then because I’m tired of writing up what’s already documented, let’s just see if ChatGPT can understand what I wrote to you so far, also pasting how to use tools.

To demonstrate how the AI might generate a response using the function specification for robot_actions and incorporate it into the given JSON format for output, let’s prepare a clever and insightful answer. The AI will first set the appropriate facial expressions based on the context or sentiment of the response it plans to give. Here’s an illustrative example where the AI chooses to display a thoughtful expression using a slight smile (smirk) and concentrated eyes (squint).

Example JSON Response by the AI

{
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "function_call": {
          "name": "robot_actions",
          "arguments": "{\n  \"mouth_expression\": \"smirk\",\n  \"eye_expression\": \"squint\"\n}"
        }
      },
      "finish_reason": "function_call"
    },
    {
      "index": 1,
      "message": {
        "role": "assistant",
        "content": "Indeed, the complexity of emotions we can express, even with simple changes in our facial expressions, reflects the deep interconnectedness of human feelings and social communication.",
        "function_call": null
      },
      "finish_reason": "completed"
    }
  ]
}

Detailed Explanation of the Response:

  1. First Message (Function Call):

    • Role: "assistant" - The actor in the conversation (the AI).
    • Content: null - There is no text content in this message because the purpose of this message is solely to update the robot’s facial expressions.
    • Function Call:
      • Name: "robot_actions" - Refers to the function that controls the facial expressions of the robot.
      • Arguments:
        • "mouth_expression": "smirk" - The AI sets the mouth to a smirk, perhaps to indicate a nuanced or mildly amused response.
        • "eye_expression": "squint" - The eyes are set to squint to complement the smirk, showing concentration or a thoughtful stance.
    • Finish Reason: "function_call" - Indicates that this message’s purpose was to execute a function call (i.e., to update the facial expressions).
  2. Second Message (Insightful Answer):

    • Role: "assistant"
    • Content: A carefully crafted statement reflecting on emotions and their expression through facial cues. This content follows the expression set by the previous function call, tying the visual and textual responses together coherently.
    • Function Call: null - No function is called in this message as it is purely textual.
    • Finish Reason: "completed" - Signifies that the assistant has finished delivering its response following the update to its facial expressions.

This JSON response example showcases how an AI can use both visual cues (facial expressions) and textual information to provide a rich, multi-modal interaction experience. The thoughtful combination of expressions and insightful commentary can make interactions with the AI more engaging and meaningful for users.

In actuality, you must have code that actually interacts with the robot face. Then you have to return a value before the AI will write to the user. The return provides a tool result with natural language inside, such as "success: the robot face is now smirk and squint".
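
As a sketch of that last step, matching the legacy function_call format shown in the example above (set_robot_face is a hypothetical helper standing in for whatever code drives your robot face):

import json

def handle_function_call(client, model, messages, response_message):
    call = response_message.function_call
    args = json.loads(call.arguments)

    # Hypothetical: actually drive the hardware / UI here.
    set_robot_face(mouth=args["mouth_expression"], eyes=args["eye_expression"])

    # Return the promised result in natural language, then let the AI write to the user.
    messages.append(response_message)
    messages.append({
        "role": "function",  # legacy counterpart of the newer "tool" role
        "name": call.name,
        "content": f"success: the robot face is now {args['mouth_expression']} and {args['eye_expression']}",
    })
    return client.chat.completions.create(model=model, messages=messages)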

Here's a possible way (see the sketch after this list):

  1. When you invoke the make_car function in your chatbot, it will call a similar command in the 3D simulator and return an id, which you'll use to call functions on that object in the 3D simulator. Use this id as your tool output.
  2. Now, when you invoke make_sound, since the car id is in the context, you'll pass the id to the make-sound function in the 3D simulator, and that particular object will make the sound.
  3. Even if you have created several cars, as long as you have the ids you can tell your bot to invoke make_sound for any of those car objects. Even if you don't mention the id at all, as long as it is in the context you can say "first car", "second car", etc. Or, if you return some other description with the id, like manufacturer, model, or color, you can invoke make_sound using those and it will probably pick the correct car id.
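
A minimal sketch of that id approach, reusing the Car class from the question (the cars registry and the id format are assumptions for illustration, not anything from a real simulator):

cars: dict[str, Car] = {}  # registry mapping ids to live Car objects

def make_car(name: str, model: str) -> str:
    car_id = f"car_{len(cars) + 1}"
    cars[car_id] = Car(name, model)
    # The model only ever sees this string; the object itself stays in your process.
    return f"created {car_id}: {name} {model}"

def make_sound(car_id: str) -> str:
    car = cars.get(car_id)
    if car is None:
        return f"error: no car with id {car_id}"
    return f"{car} makes sound pwweeee"

The make_sound tool schema then declares a single string parameter, car_id, instead of an object, and the model fills it in from the id it saw in the earlier tool result.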

Thank you so much. Your support means a lot to me and I truly appreciate your efforts.