How does function calling actually work for the Assistants API?

I have a really hard time understanding the documentation for the Assistants API. What am I missing?
https://platform.openai.com/docs/assistants/tools/function-calling

from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
  instructions="You are a weather bot. Use the provided functions to answer questions.",
  model="gpt-4-turbo-preview",
  tools=[{
    "type": "function",
    "function": {
      "name": "getCurrentWeather",
      "description": "Get the weather in location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {"type": "string", "description": "The city and state e.g. San Francisco, CA"},
          "unit": {"type": "string", "enum": ["c", "f"]}
        },
        "required": ["location"]
      }
    }
  }, {
    "type": "function",
    "function": {
      "name": "getNickname",
      "description": "Get the nickname of a city",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {"type": "string", "description": "The city and state e.g. San Francisco, CA"}
        },
        "required": ["location"]
      }
    }
  }]
)
  1. I have built several custom GPTs which have access to APIs that I've built, so I understand that the schema for defining functions is similar to an OpenAPI schema.

But what I don't understand is: is the schema definition provided in the example for actual Python code, or is it referring to an external API?

It doesn't make sense in my head that this would refer to Python functions, for example. What am I missing here?

Where and what is the parameter for defining where said functions are stored?
Is the function definition for REST API calls?

Is it for Python functions that I have created and stored on some server or locally? (That doesn't make sense; how would it even have access to those functions to begin with if they aren't exposed through an HTTP endpoint, SSH, or some other network protocol? And if that were the case, what parameter indicates how I would expose the Assistant to said hypothetical functions?)

I mean yeah, okay, I get that you are supposed to define the functions here. But where do the functions come from? How do I give the assistant access to functions that I have created?
Is this for defining instructions on how to interact with an API? And if so, WHERE IS THE PARAMETER FOR PUTTING THE ACTUAL HTTP ENDPOINT?

This documentation raises too many questions.

OpenAI: be better and write clearer documentation.


The Assistant, or any API model, creates a universal mock function call that you need to validate, clean & process.

I agree though. The documentation is lacking.

Okay…?

The Assistant, or any API model, creates a mock function call that you need to validate, clean & process.

What does this even mean?

I’ll make it as simple as possible:

  1. You provide function definitions
  2. When the model determines a function should be called, it writes up a mock-up of the function call
  3. You call the function yourself and return the results

It's really simple. I don't know why you are being sassy about it. You are over-complicating things by trying to relate it to GPTs, which make the call for you.

I really don't want to argue, so please take this to GPT. Start here:

https://chat.openai.com/share/35b721e0-e378-478d-abdf-4b71af4c8232

The only part it missed is that once you process the function, you return the results to the model.

Let's say you have this schema:

const schema = {
    "openapi":"3.1.0",
    "info":{
      "title":"Wolfram",
      "version":"v0.1"
    },
    "servers":[
      {
        "url":"https://www.wolframalpha.com",
        "description":"Wolfram Server for ChatGPT"
      }
    ],
    "paths": {
      "/api/v1/cloud-plugin": {
        "get": {
          "operationId": "getWolframCloudResults",
          "externalDocs": "https://reference.wolfram.com/language/",
          "summary": "Evaluate Wolfram Language code",
          "responses": {
            "200": {
              "description": "The result of the Wolfram Language evaluation",
              "content": {
                "text/plain": {}
              }
            },
            "500": {
              "description": "Wolfram Cloud was unable to generate a result"
            },
            "400": {
              "description": "The request is missing the 'input' parameter"
            },
            "403": {
              "description": "Unauthorized"
            },
            "503":{
              "description":"Service temporarily unavailable. This may be the result of too many requests."
            }
          },
          "parameters": [
            {
              "name": "input",
              "in": "query",
              "description": "the input expression",
              "required": true,
              "schema": {
                "type": "string"
              }
            }
          ]
        }
      },
      "/api/v1/llm-api": {
        "get":{
          "operationId":"getWolframAlphaResults",
          "externalDocs":"https://products.wolframalpha.com/api",
          "summary":"Get Wolfram|Alpha results",
          "responses":{
            "200":{
              "description":"The result of the Wolfram|Alpha query",
              "content":{
                "text/plain":{
                }
              }
            },
            "400":{
              "description":"The request is missing the 'input' parameter"
            },
            "403":{
              "description":"Unauthorized"
            },
            "500":{
              "description":"Wolfram|Alpha was unable to generate a result"
            },
            "501":{
              "description":"Wolfram|Alpha was unable to generate a result"
            },
            "503":{
              "description":"Service temporarily unavailable. This may be the result of too many requests."
            }
          },
          "parameters":[
            {
              "name":"input",
              "in":"query",
              "description":"the input",
              "required":true,
              "schema":{
                "type":"string"
              }
            },
            {
              "name":"assumption",
              "in":"query",
              "description":"the assumption to use, passed back from a previous query with the same input.",
              "required":false,
              "explode":true,
              "style":"form",
              "schema":{
                "type":"array",
                "items":{
                  "type":"string"
                }
              }
            }
          ]
        }
      }
    }
 }

So you convert them to functions:

{
      "name": "getWolframCloudResults",
      "description": "Evaluate Wolfram Language code",
      "parameters": {
        "type": "object",
        "properties": {
          "input": {
            "type": "string",
            "description": "the input expression"
          }
        },
        "required": [
          "input"
        ]
      }
    }

and

    {
      "name": "getWolframAlphaResults",
      "description": "Get Wolfram|Alpha results",
      "parameters": {
        "type": "object",
        "properties": {
          "input": {
            "type": "string",
            "description": "the input"
          },
          "assumption": {
            "type": "array",
            "description": "the assumption to use, passed back from a previous query with the same input.",
            "items": {
              "type": "string"
            }
          }
        },
        "required": [
          "input"
        ]
      }
    }

You attach them to your assistant via the Assistants API.
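(Sketched in Python here for brevity; the polling example below is JavaScript.) Attaching those two function definitions when creating the assistant looks roughly like this; the two *_def variables are just placeholders for the JSON objects above:

from openai import OpenAI

client = OpenAI()

# get_wolfram_cloud_results_def and get_wolfram_alpha_results_def stand in for
# the two function definitions shown above, loaded into Python dicts.
assistant = client.beta.assistants.create(
    instructions="Use Wolfram for computation and factual queries.",
    model="gpt-4-turbo-preview",
    tools=[
        {"type": "function", "function": get_wolfram_cloud_results_def},
        {"type": "function", "function": get_wolfram_alpha_results_def},
    ],
)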
Then, while you are polling your Run:

let completed = false

// (run this inside an async function; sleep is a small helper, e.g.
// const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)))
do {

  const run = await openai.beta.threads.runs.retrieve(threadId, runId)

  if (run.status === 'requires_action') {
    const tool_calls = run.required_action.submit_tool_outputs.tool_calls

    let tool_outputs = []

    // use for...of rather than forEach so that await works inside the loop
    for (const tool of tool_calls) {

      const tool_args = JSON.parse(tool.function.arguments)

      let tool_output = { status: 'error', message: 'function not found' }

      if (tool.function.name === 'getWolframCloudResults') {
        // tool_args: { input: '...' }
        const response = await fetch(`https://www.wolframalpha.com/api/v1/cloud-plugin?input=${encodeURIComponent(tool_args.input)}`)
        tool_output = await response.text() // the endpoint returns text/plain
      } else if (tool.function.name === 'getWolframAlphaResults') {
        // tool_args: { input: '...', assumption: [...] }
        const response = await fetch(`https://www.wolframalpha.com/api/v1/llm-api?input=${encodeURIComponent(tool_args.input)}&assumption=${tool_args.assumption}`)
        tool_output = await response.text() // the endpoint returns text/plain
      }

      tool_outputs.push({
        tool_call_id: tool.id,
        output: JSON.stringify(tool_output)
      })

    }

    // send back output
    await openai.beta.threads.runs.submitToolOutputs(
      threadId,
      runId,
      {
        tool_outputs: tool_outputs,
      }
    )

  } else if (run.status === 'completed') {
    // handle completed
    completed = true
  } else {
    // other statuses: queued, in_progress, cancelled, failed, expired...
  }

  await sleep(1000)

} while (!completed)

You still need to do the fetching to the external API yourself given the parameters from the functions.


Wasn’t trying to be sassy, genuinely confused.
I did find this article, however, for anyone who might stumble upon this post in a Google search:

It seems to explain some of the things that I am missing.
So when you are using chat completions, you can define utility functions (such as SQLite functions for querying a database, as is used as an example in the cookbook article), and you use the tools parameter to tell the model when and how to use these specific Python functions, which are described in the function schema definitions. So in the part of the code where the gpt-4 model is used, we call the function in the same file? (At least from what I can see in the cookbook example.)
Hmm… Okay, I guess I'll have to burn through some OpenAI credits before I get the hang of this…
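For anyone else reading along, a minimal sketch of that cookbook pattern (not its exact code); ask_database here is a placeholder for your own local Python utility function:

import json
from openai import OpenAI

client = OpenAI()

def ask_database(query: str) -> str:
    # Placeholder: run the SQL against your own SQLite file and return the rows.
    return "347 albums"

tools = [{"type": "function", "function": {
    "name": "ask_database",
    "description": "Run a read-only SQL query against the local music database",
    "parameters": {"type": "object",
                   "properties": {"query": {"type": "string",
                                            "description": "A valid SQLite SELECT statement"}},
                   "required": ["query"]}}}]

response = client.chat.completions.create(
    model="gpt-4-turbo-preview",
    messages=[{"role": "user", "content": "How many albums are in the database?"}],
    tools=tools,
)

# The model never runs ask_database itself; it only sends back the name and arguments.
for call in response.choices[0].message.tool_calls or []:
    args = json.loads(call.function.arguments)
    print(call.function.name, "->", ask_database(args["query"]))  # your file, your call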

Still a little bit confused


Ohhhh! Okay, I see now. This explains a lot, thanks

Ok. Sorry for the assumption.

You’re kind of on the right track. I believe the confusion comes from using GPTs first.

The functions are universal. It doesn’t need to be Python or anything. You are (usually) using the same code that manages your logic to also perform the functions.

The GPTs make the call on your behalf (because you aren’t managing them) so there’s a lot more going on.

You’re kind of on the right track. I believe the confusion comes from using GPTs first.

Yeah, most likely. The docs for custom GPTs are extremely simple and straightforward; they make sense.
Well… it’s going to be fun to integrate over 30 different endpoints into this now. My openapi.yaml file is already 1500 lines long.

I wonder if OpenAI makes a lot of money from newbie junior devs like me. I have already paid about 20 dollars to OpenAI now, and that's only from the 3-4 tests I ran to see if my code worked properly.


The good news is that with such a large number of functions to call, you can practice with GPTs, ensure a consistent experience, then move on to the API, where you can fine-tune the function calling to reduce costs (as all of these functions are sent with each message).

The schema for function calling is much more distilled as well.

Good luck in your coding

I’ll explain it a bit better.

Assistants - Functions (tools)

Defining Functions in Assistants

  • When you create the definition and operation method of an assistant with the assistants.create API method or an "assistants" base URL call, you provide a tools specification, which can have a list of functions you created and/or a list of built-in tools.
  • When you write the functions' specification, it must adhere to a particular object schema that has the name and the parameters the AI can specify when sending.
  • The function specification is not received by the AI exactly as you wrote it in JSON. The specification is parsed, and only particular keys, such as "description", are included. That means examples, maximum values, etc. might not be rejected, but they are never seen. The text descriptions of exactly what the function is meant to achieve and return, and the parameter names, are paramount to the AI's understanding.

Receiving Functions

  • When the AI has found that a user input could benefit from, or could be directly addressed by, a function specified by the developer, the AI will send a message to the function name instead of sending a message to be read by a user (on chat completions the AI may combine them, but this seems trained out).
  • You make continued polling requests with runs.retrieve, or a URL with the run ID. This gives you information about the progress of the run.
  • When the run status is updated by the AI calling for function(s), instead of completed you will receive requires_action. This indicates a function request is available.
  • You then parse that run object, which will now contain a "required_action" object, in tools format, with a list of all the functions the AI wants to call in parallel. Parse them out.

Note: the parameters that are emitted originate directly from an AI-written JSON. The AI’s ability to understand the specification and create this is important.
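A minimal Python sketch of that polling and parsing, assuming client, thread_id, and run_id already exist from creating the thread and run:

import time

# Poll until the run needs our tool output (or reaches a terminal status).
while True:
    run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
    if run.status in ("requires_action", "completed", "failed", "cancelled", "expired"):
        break
    time.sleep(1)

if run.status == "requires_action":
    tool_calls = run.required_action.submit_tool_outputs.tool_calls
    for call in tool_calls:
        # call.function.arguments is the AI-written JSON described above
        print(call.id, call.function.name, call.function.arguments)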

Returning functions

  • Fulfill the request with your own code. Make any external requests to the APIs you employ over your own internet connection.
  • The AI is encouraged to make use of multiple functions at once; expect this.
  • Create the return object with all responses and IDs matching what was emitted.
  • Send the tool outputs object to the API with runs.submit_tool_outputs.
  • The run continues if the form of what you sent is not rejected.
  • Continue to poll for status.
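A minimal sketch of those return steps, continuing from the tool_calls parsed above; execute_my_function is a placeholder for your own code:

import json

tool_outputs = []
for call in tool_calls:
    args = json.loads(call.function.arguments)
    result = execute_my_function(call.function.name, args)  # placeholder: your own code / external API
    tool_outputs.append({"tool_call_id": call.id, "output": json.dumps(result)})

client.beta.threads.runs.submit_tool_outputs(
    thread_id=thread_id,
    run_id=run_id,
    tool_outputs=tool_outputs,
)
# then keep polling runs.retrieve until run.status is "completed"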

I hope this tediously human-written explanation serves you. It overlaps the Documentation and API reference (which require many back-and-forth references between themselves for understanding).

When you have written your own tools, you then can also replace those that are built in. The tedium of a dozen assistants API methods then becomes preposterous compared to owning your own chat completions code you can supervise.


How do you fine-tune an assistant, specifically for function calling? Is it the same process as the completions fine-tuning?

Fine-tune models cannot be used in the Assistants endpoint.

Developer fine-tunes are more likely to damage the function-calling training than to improve it, unless you devote experienced labor to the development of training example sets (as one discovers when they are used on the chat completions endpoint).

For improving the quality of tool-calling, you have the function specification itself, the name and the parameter names within, and the description text of parameters, to make the utility of the function and the expected return value extremely clear to the AI.
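For example, here is an illustrative (not official) version of the weather function from the top of the thread, with descriptions written to make its purpose and return value unmistakable:

weather_tool = {
    "type": "function",
    "function": {
        "name": "getCurrentWeather",
        "description": "Current observed weather for a city. Call whenever the user asks about "
                       "weather, temperature, or outdoor conditions. Returns a short plain-text "
                       "summary such as '18 C, light rain'.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string",
                             "description": "City and state/country, e.g. 'San Francisco, CA'"},
                "unit": {"type": "string", "enum": ["c", "f"],
                         "description": "Temperature unit the user prefers; defaults to 'c'"},
            },
            "required": ["location"],
        },
    },
}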

Does anyone have an idea if there is a community sharing some simple functions? This thread did help me understand function calling better, but I feel many calls will be rather similar, so I'm just wondering if there is a community share.
I'm looking to create a PPT from a discussion, but I'm having difficulty actually getting the function call to work.

After using function calling for the last month, I wanted to provide my thoughts on how to effectively use it.

  • most of the time the LLM should be calling one or more APIs (which are disguised as functions). I normally wrap these in Python requests functions which call a FastAPI endpoint (a sketch of this wrapper pattern follows the list). This approach is tedious BUT generates really good results. I personally use the LlamaIndex wrapper on the OpenAI Agent, which has multi-turn single function calling. This allows the function calling to be self-organized by the LLM.

  • It does seem that you can provide an executable function; I don't really know how or why this works. For example, you can pass in a simple add or subtract Python function and it will execute it on your behalf. I think it's calling an eval("python code") on the local Python kernel or using Code Interpreter to run the functions. The reason this doesn't work well for my use cases is that I am usually performing retrieval, which requires a dependency. The dependency doesn't seem to work in the executable Python function, which is the main reason I wrap the tool in an API.
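A sketch of that requests-wrapper pattern; the /search endpoint, its payload, and the function name are all hypothetical:

import requests

def search_documents(query: str, top_k: int = 5) -> str:
    """Function exposed to the LLM as a tool; it just proxies to your own FastAPI service."""
    response = requests.post(
        "http://localhost:8000/search",  # hypothetical FastAPI endpoint you host
        json={"query": query, "top_k": top_k},
        timeout=30,
    )
    response.raise_for_status()
    return response.text  # whatever comes back is what you hand to the model as the tool output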

By adopting this approach, I was able to organize my project so that there is a many-to-many relationship between functions to be called and agents. You can also nest agents within function calling (think HTTP request):

├── backend
│   ├── agents
│   │   ├── analyst
│   │   └── zendesk
│   │       ├── agent.py
│   │       ├── Dockerfile
│   │       ├── main.py
│   │       ├── requirements.txt
│   │       ├── router.py
│   │       └── utils.py
│   └── data
│       ├── data_dictionary
│       ├── zendesk
│       ├── Dockerfile
│       ├── main.py
│       └── requirements.txt
├── frontend
├── notebooks
├── pipeline
└── README.md

Another take…
You provide the LLM with the definition or blueprint of a function you have. The example they use is the get weather function that takes in a location parameter.
Once GPT determines it should call this function to generate a better response, it will reply to you (the thing/system sending/processing these OpenAI API calls) with a function message that essentially tells your code: "Hey, run this function and return the results to me".
Your code sees this message and knows (because you programmed it) to call the function (remember, you told it you have these functions) and return the results via a special function message.

I think one misconception is that OpenAI/GPT is actually calling a function for you in their system.

In the case of the get weather example they use, they call a 3rd party weather API from the example code in a similar process as I described above.
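A minimal sketch of that round trip for the weather example (chat completions); get_current_weather here is a stand-in for your own code calling a 3rd-party weather API:

import json
from openai import OpenAI

client = OpenAI()

def get_current_weather(location: str, unit: str = "c") -> str:
    # Stand-in for your own call to a 3rd-party weather API.
    return f"18 {unit.upper()} and partly cloudy in {location}"

tools = [{"type": "function", "function": {
    "name": "get_current_weather",
    "description": "Get the current weather in a location",
    "parameters": {"type": "object",
                   "properties": {"location": {"type": "string"},
                                  "unit": {"type": "string", "enum": ["c", "f"]}},
                   "required": ["location"]}}}]

messages = [{"role": "user", "content": "What's the weather in San Francisco?"}]
first = client.chat.completions.create(model="gpt-4-turbo-preview", messages=messages, tools=tools)
assistant_msg = first.choices[0].message

if assistant_msg.tool_calls:  # "Hey, run this function and return the results to me"
    messages.append(assistant_msg)
    for call in assistant_msg.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({"role": "tool", "tool_call_id": call.id,
                         "content": get_current_weather(**args)})  # your code ran it, not OpenAI
    final = client.chat.completions.create(model="gpt-4-turbo-preview", messages=messages, tools=tools)
    print(final.choices[0].message.content)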


I hope this helps anyone looking for how to do polling for the Assistants API and Threads API: the package @tmlc/openai-polling on npm (TinyMLCompany/openai-polling on GitHub).