How does function calling actually work for the Assistants API?

I have a really hard time understanding the documentation for the Assistants API. What am I missing?
https://platform.openai.com/docs/assistants/tools/function-calling

from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
  instructions="You are a weather bot. Use the provided functions to answer questions.",
  model="gpt-4-turbo-preview",
  tools=[{
    "type": "function",
    "function": {
      "name": "getCurrentWeather",
      "description": "Get the weather in location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {"type": "string", "description": "The city and state e.g. San Francisco, CA"},
          "unit": {"type": "string", "enum": ["c", "f"]}
        },
        "required": ["location"]
      }
    }
  }, {
    "type": "function",
    "function": {
      "name": "getNickname",
      "description": "Get the nickname of a city",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {"type": "string", "description": "The city and state e.g. San Francisco, CA"}
        },
        "required": ["location"]
      }
    }
  }]
)
  1. I have built several custom GPTs which have access to APIs that I've built, so I understand that the schema for defining functions is similar to an OpenAPI schema.

But what I don't understand is: is the schema definition provided in the example for actual Python code, or is it referring to an external API?

It doesn't make sense in my head that this would refer to Python functions, for example. What am I missing here?

Where and what is the parameter for defining where said functions are stored?
Is the function definition for REST API calls?

Is it for Python functions that I have created and stored on some server or locally? (That doesn't make sense; how would it even have access to those functions to begin with if they aren't exposed through an HTTP endpoint, SSH, or some other network protocol? And if that were the case, what parameter indicates how I would expose the Assistant to said hypothetical functions?)

I mean yeah, okay, I get that you are supposed to define the functions here. But where do the functions come from? How do I give the assistant access to functions that I have created?
Is this for defining instructions on how to interact with an API? And if so, WHERE IS THE PARAMETER FOR PUTTING THE ACTUAL HTTP ENDPOINT?

This documentation raises too many questions.

OpenAI: be better and write clearer documentation.


The Assistant, or any API model, creates a universal mock function call that you need to validate, clean & process.

I agree though. The documentation is lacking.

Okay…?

The Assistant, or any API model, creates a mock function call that you need to validate, clean & process.

What does this even mean?

I’ll make it as simple as possible:

  1. You provide function definitions
  2. When the model determines a function should be called, it writes up a mock-up of the function call
  3. You call the function yourself and return the results

It's really simple. I don't know why you are being sassy about it. You are over-complicating things by trying to relate it to GPTs, which make the call for you.

I really don't want to argue, so please take this to GPT. Start here:

https://chat.openai.com/share/35b721e0-e378-478d-abdf-4b71af4c8232

The only part it missed is that once you process the function, you return the results to the model.

Let's say you have this schema:

const schema = {
    "openapi":"3.1.0",
    "info":{
      "title":"Wolfram",
      "version":"v0.1"
    },
    "servers":[
      {
        "url":"https://www.wolframalpha.com",
        "description":"Wolfram Server for ChatGPT"
      }
    ],
    "paths": {
      "/api/v1/cloud-plugin": {
        "get": {
          "operationId": "getWolframCloudResults",
          "externalDocs": "https://reference.wolfram.com/language/",
          "summary": "Evaluate Wolfram Language code",
          "responses": {
            "200": {
              "description": "The result of the Wolfram Language evaluation",
              "content": {
                "text/plain": {}
              }
            },
            "500": {
              "description": "Wolfram Cloud was unable to generate a result"
            },
            "400": {
              "description": "The request is missing the 'input' parameter"
            },
            "403": {
              "description": "Unauthorized"
            },
            "503":{
              "description":"Service temporarily unavailable. This may be the result of too many requests."
            }
          },
          "parameters": [
            {
              "name": "input",
              "in": "query",
              "description": "the input expression",
              "required": true,
              "schema": {
                "type": "string"
              }
            }
          ]
        }
      },
      "/api/v1/llm-api": {
        "get":{
          "operationId":"getWolframAlphaResults",
          "externalDocs":"https://products.wolframalpha.com/api",
          "summary":"Get Wolfram|Alpha results",
          "responses":{
            "200":{
              "description":"The result of the Wolfram|Alpha query",
              "content":{
                "text/plain":{
                }
              }
            },
            "400":{
              "description":"The request is missing the 'input' parameter"
            },
            "403":{
              "description":"Unauthorized"
            },
            "500":{
              "description":"Wolfram|Alpha was unable to generate a result"
            },
            "501":{
              "description":"Wolfram|Alpha was unable to generate a result"
            },
            "503":{
              "description":"Service temporarily unavailable. This may be the result of too many requests."
            }
          },
          "parameters":[
            {
              "name":"input",
              "in":"query",
              "description":"the input",
              "required":true,
              "schema":{
                "type":"string"
              }
            },
            {
              "name":"assumption",
              "in":"query",
              "description":"the assumption to use, passed back from a previous query with the same input.",
              "required":false,
              "explode":true,
              "style":"form",
              "schema":{
                "type":"array",
                "items":{
                  "type":"string"
                }
              }
            }
          ]
        }
      }
    }
 }

So you convert them to functions:

{
      "name": "getWolframCloudResults",
      "description": "Evaluate Wolfram Language code",
      "parameters": {
        "type": "object",
        "properties": {
          "input": {
            "type": "string",
            "description": "the input expression"
          }
        },
        "required": [
          "input"
        ]
      }
    }

and

    {
      "name": "getWolframAlphaResults",
      "description": "Get Wolfram|Alpha results",
      "parameters": {
        "type": "object",
        "properties": {
          "input": {
            "type": "string",
            "description": "the input"
          },
          "assumption": {
            "type": "array",
            "description": "the assumption to use, passed back from a previous query with the same input.",
            "items": {
              "type": "string"
            }
          }
        },
        "required": [
          "input"
        ]
      }
    }

You attach them to your assistant via the Assistants API.
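(Sketched in Python here for brevity; the polling example below is JavaScript.) Attaching those two function definitions when creating the assistant looks roughly like this; the two *_def variables are just placeholders for the JSON objects above:

from openai import OpenAI

client = OpenAI()

# get_wolfram_cloud_results_def and get_wolfram_alpha_results_def stand in for
# the two function definitions shown above, loaded into Python dicts.
assistant = client.beta.assistants.create(
    instructions="Use Wolfram for computation and factual queries.",
    model="gpt-4-turbo-preview",
    tools=[
        {"type": "function", "function": get_wolfram_cloud_results_def},
        {"type": "function", "function": get_wolfram_alpha_results_def},
    ],
)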
Then, while you are polling your Run:

let completed = false

// (run this inside an async function; sleep is a small helper, e.g.
// const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)))
do {

  const run = await openai.beta.threads.runs.retrieve(threadId, runId)

  if (run.status === 'requires_action') {
    const tool_calls = run.required_action.submit_tool_outputs.tool_calls

    let tool_outputs = []

    // use for...of rather than forEach so that await works inside the loop
    for (const tool of tool_calls) {

      const tool_args = JSON.parse(tool.function.arguments)

      let tool_output = { status: 'error', message: 'function not found' }

      if (tool.function.name === 'getWolframCloudResults') {
        // tool_args: { input: '...' }
        const response = await fetch(`https://www.wolframalpha.com/api/v1/cloud-plugin?input=${encodeURIComponent(tool_args.input)}`)
        tool_output = await response.text() // the endpoint returns text/plain
      } else if (tool.function.name === 'getWolframAlphaResults') {
        // tool_args: { input: '...', assumption: [...] }
        const response = await fetch(`https://www.wolframalpha.com/api/v1/llm-api?input=${encodeURIComponent(tool_args.input)}&assumption=${tool_args.assumption}`)
        tool_output = await response.text() // the endpoint returns text/plain
      }

      tool_outputs.push({
        tool_call_id: tool.id,
        output: JSON.stringify(tool_output)
      })

    }

    // send back output
    await openai.beta.threads.runs.submitToolOutputs(
      threadId,
      runId,
      {
        tool_outputs: tool_outputs,
      }
    )

  } else if (run.status === 'completed') {
    // handle completed
    completed = true
  } else {
    // other statuses: queued, in_progress, cancelled, failed, expired...
  }

  await sleep(1000)

} while (!completed)

You still need to do the fetching to the external API yourself given the parameters from the functions.


Wasn’t trying to be sassy, genuinely confused.
I did find this article, however, for anyone who might stumble upon this post in a Google search:

It seems to explain some of the things that I am missing.
So when you are using chat completions, you can define utility functions (such as SQLite functions for querying a database, as is used as an example in the cookbook article), and you use the tools parameter to tell the model when and how to use these specific Python functions, which are described in the function schema definitions. So in the part of the code where the gpt-4 model is used, we call the function in the same file? (At least from what I can see in the cookbook example.)
Hmm… Okay, I guess I'll have to burn through some OpenAI credits before I get the hang of this…
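For anyone else reading along, a minimal sketch of that cookbook pattern (not its exact code); ask_database here is a placeholder for your own local Python utility function:

import json
from openai import OpenAI

client = OpenAI()

def ask_database(query: str) -> str:
    # Placeholder: run the SQL against your own SQLite file and return the rows.
    return "347 albums"

tools = [{"type": "function", "function": {
    "name": "ask_database",
    "description": "Run a read-only SQL query against the local music database",
    "parameters": {"type": "object",
                   "properties": {"query": {"type": "string",
                                            "description": "A valid SQLite SELECT statement"}},
                   "required": ["query"]}}}]

response = client.chat.completions.create(
    model="gpt-4-turbo-preview",
    messages=[{"role": "user", "content": "How many albums are in the database?"}],
    tools=tools,
)

# The model never runs ask_database itself; it only sends back the name and arguments.
for call in response.choices[0].message.tool_calls or []:
    args = json.loads(call.function.arguments)
    print(call.function.name, "->", ask_database(args["query"]))  # your file, your call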

Still a little bit confused


Ohhhh! Okay, I see now. This explains a lot, thanks

Ok. Sorry for the assumption.

You’re kind of on the right track. I believe the confusion comes from using GPTs first.

The functions are universal. It doesn’t need to be Python or anything. You are (usually) using the same code that manages your logic to also perform the functions.

The GPTs make the call on your behalf (because you aren’t managing them) so there’s a lot more going on.

You’re kind of on the right track. I believe the confusion comes from using GPTs first.

Yeah, most likely. The docs for custom GPTs are extremely simple and straightforward; they make sense.
Well… it’s going to be fun to integrate over 30 different endpoints into this now. My openapi.yaml file is already 1500 lines long.

I wonder if OpenAI makes a lot of money from newbie junior devs like me. I have already paid about 20 dollars to OpenAI now, and that's only from the 3-4 tests I ran to see if my code worked properly.


The good news is that with such a large number of functions to call, you can practice with GPTs, ensure a consistent experience, then move on to the API, where you can fine-tune the function calling to reduce costs (as all of these functions are sent with each message).

The schema for function calling is much more distilled as well.

Good luck in your coding

I’ll explain it a bit better.

Assistants - Functions (tools)

Defining Functions in Assistants

  • When you create the definition and operation method of an assistant with the assistants.create API method or an "assistants" base URL call, you provide a tools specification, which can have a list of functions you created and/or a list of built-in tools.
  • When you write the functions' specification, it must adhere to a particular object schema that has the name and the parameters the AI can specify when sending.
  • The function specification is not received by the AI exactly as you wrote it in JSON. The specification is parsed, and only particular keys, such as "description", are included. That means examples, maximum values, etc. might not be rejected, but they are never seen. The text descriptions of exactly what the function is meant to achieve and return, and the parameter names, are paramount to the AI's understanding.

Receiving Functions

  • When the AI has found that a user input could benefit from, or could be directly addressed by, a function specified by the developer, the AI will send a message to the function name instead of sending a message to be read by a user (on chat completions the AI may combine them, but this seems trained out).
  • You make continued polling requests with runs.retrieve, or a URL with the run ID. This gives you information about the progress of the run.
  • When the run status is updated by the AI calling for function(s), instead of completed you will receive requires_action. This indicates a function request is available.
  • You then parse that run object, which will now contain a "required_action" object, in tools format, with a list of all the functions the AI wants to call in parallel. Parse them out.

Note: the parameters that are emitted originate directly from an AI-written JSON. The AI’s ability to understand the specification and create this is important.
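A minimal Python sketch of that polling and parsing, assuming client, thread_id, and run_id already exist from creating the thread and run:

import time

# Poll until the run needs our tool output (or reaches a terminal status).
while True:
    run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
    if run.status in ("requires_action", "completed", "failed", "cancelled", "expired"):
        break
    time.sleep(1)

if run.status == "requires_action":
    tool_calls = run.required_action.submit_tool_outputs.tool_calls
    for call in tool_calls:
        # call.function.arguments is the AI-written JSON described above
        print(call.id, call.function.name, call.function.arguments)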

Returning functions

  • Fulfill the request with your own code. Make any external requests to the APIs you employ over your own internet connection.
  • The AI is encouraged to make use of multiple functions at once; expect this.
  • Create the return object with all responses and IDs matching what was emitted.
  • Send the tool outputs object to the API with runs.submit_tool_outputs.
  • The run continues if the form of what you sent is not rejected.
  • Continue to poll for status.
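A minimal sketch of those return steps, continuing from the tool_calls parsed above; execute_my_function is a placeholder for your own code:

import json

tool_outputs = []
for call in tool_calls:
    args = json.loads(call.function.arguments)
    result = execute_my_function(call.function.name, args)  # placeholder: your own code / external API
    tool_outputs.append({"tool_call_id": call.id, "output": json.dumps(result)})

client.beta.threads.runs.submit_tool_outputs(
    thread_id=thread_id,
    run_id=run_id,
    tool_outputs=tool_outputs,
)
# then keep polling runs.retrieve until run.status is "completed"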

I hope this tediously human-written explanation serves you. It overlaps the Documentation and API reference (which require many back-and-forth references between themselves for understanding).

When you have written your own tools, you then can also replace those that are built in. The tedium of a dozen assistants API methods then becomes preposterous compared to owning your own chat completions code you can supervise.


How do you fine-tune an assistant, specifically for function calling? Is it the same process as the completions fine-tuning?

Fine-tune models cannot be used in the Assistants endpoint.

Developer fine-tunes are more likely to damage the function-calling training than to improve it, unless you devote experienced labor to the development of training example sets (as one discovers when they are used on the chat completions endpoint).

For improving the quality of tool-calling, you have the function specification itself, the name and the parameter names within, and the description text of parameters, to make the utility of the function and the expected return value extremely clear to the AI.
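For example, here is an illustrative (not official) version of the weather function from the top of the thread, with descriptions written to make its purpose and return value unmistakable:

weather_tool = {
    "type": "function",
    "function": {
        "name": "getCurrentWeather",
        "description": "Current observed weather for a city. Call whenever the user asks about "
                       "weather, temperature, or outdoor conditions. Returns a short plain-text "
                       "summary such as '18 C, light rain'.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string",
                             "description": "City and state/country, e.g. 'San Francisco, CA'"},
                "unit": {"type": "string", "enum": ["c", "f"],
                         "description": "Temperature unit the user prefers; defaults to 'c'"},
            },
            "required": ["location"],
        },
    },
}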

Does anyone have an idea if there is a community sharing some simple functions? This thread did help me understand function calling better, but I feel many calls will be rather similar, so I'm just wondering if there is a community share.
I'm looking to create a PPT from a discussion, but I'm having difficulty actually getting the function call to work.

After using function calling for the last month, I wanted to provide my thoughts on how to effectively use it.

  • most of the time the LLM should be calling one or more APIs (which are disguised as functions). I normally wrap these in Python requests functions which call a FastAPI endpoint (a sketch of this wrapper pattern follows the list). This approach is tedious BUT generates really good results. I personally use the LlamaIndex wrapper on the OpenAI Agent, which has multi-turn single function calling. This allows the function calling to be self-organized by the LLM.

  • It does seem that you can provide an executable function; I don't really know how or why this works. For example, you can pass in a simple add or subtract Python function and it will execute it on your behalf. I think it's calling an eval("python code") on the local Python kernel or using Code Interpreter to run the functions. The reason this doesn't work well for my use cases is that I am usually performing retrieval, which requires a dependency. The dependency doesn't seem to work in the executable Python function, which is the main reason I wrap the tool in an API.
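A sketch of that requests-wrapper pattern; the /search endpoint, its payload, and the function name are all hypothetical:

import requests

def search_documents(query: str, top_k: int = 5) -> str:
    """Function exposed to the LLM as a tool; it just proxies to your own FastAPI service."""
    response = requests.post(
        "http://localhost:8000/search",  # hypothetical FastAPI endpoint you host
        json={"query": query, "top_k": top_k},
        timeout=30,
    )
    response.raise_for_status()
    return response.text  # whatever comes back is what you hand to the model as the tool output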

By adopting this approach, I was able to organize my project so that there is a many-to-many relationship between functions to be called and agents. You can also nest agents within function calling (think HTTP request):

├── backend
│   ├── agents
│   │   ├── analyst
│   │   └── zendesk
│   │       ├── agent.py
│   │       ├── Dockerfile
│   │       ├── main.py
│   │       ├── requirements.txt
│   │       ├── router.py
│   │       └── utils.py
│   └── data
│       ├── data_dictionary
│       ├── zendesk
│       ├── Dockerfile
│       ├── main.py
│       └── requirements.txt
├── frontend
├── notebooks
├── pipeline
└── README.md

Another take…
You provide the LLM with the definition or blueprint of a function you have. The example they use is the get weather function that takes in a location parameter.
Once GPT determines it should call this function to generate a better response, it will reply to you (the thing/system sending/processing these OpenAI API calls) with a function message that essentially tells your code: "Hey, run this function and return the results to me".
Your code sees this message and knows (because you programmed it) to call the function (remember, you told it you have these functions) and return the results via a special function message.

I think one misconception is that OpenAI/GPT is actually calling a function for you in their system.

In the case of the get weather example they use, they call a 3rd party weather API from the example code in a similar process as I described above.
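A minimal sketch of that round trip for the weather example (chat completions); get_current_weather here is a stand-in for your own code calling a 3rd-party weather API:

import json
from openai import OpenAI

client = OpenAI()

def get_current_weather(location: str, unit: str = "c") -> str:
    # Stand-in for your own call to a 3rd-party weather API.
    return f"18 {unit.upper()} and partly cloudy in {location}"

tools = [{"type": "function", "function": {
    "name": "get_current_weather",
    "description": "Get the current weather in a location",
    "parameters": {"type": "object",
                   "properties": {"location": {"type": "string"},
                                  "unit": {"type": "string", "enum": ["c", "f"]}},
                   "required": ["location"]}}}]

messages = [{"role": "user", "content": "What's the weather in San Francisco?"}]
first = client.chat.completions.create(model="gpt-4-turbo-preview", messages=messages, tools=tools)
assistant_msg = first.choices[0].message

if assistant_msg.tool_calls:  # "Hey, run this function and return the results to me"
    messages.append(assistant_msg)
    for call in assistant_msg.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({"role": "tool", "tool_call_id": call.id,
                         "content": get_current_weather(**args)})  # your code ran it, not OpenAI
    final = client.chat.completions.create(model="gpt-4-turbo-preview", messages=messages, tools=tools)
    print(final.choices[0].message.content)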


I hope this helps anyone looking for how to do polling for the Assistants API and Threads API: the package @tmlc/openai-polling on npm (TinyMLCompany/openai-polling on GitHub).