How can I reduce token usage when my Assistant has a lot of function calling?

I’m trying to reduce my assistant’s input token usage. I declare many functions, but not all of them are used in any given interaction. I’d like the model to somehow detect which function is needed, declare only that function to the assistant, and then use it.
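In other words, something like this two-step flow (a rough sketch of what I have in mind, with hypothetical function schemas and a naive keyword router, not working code):

```python
# Hypothetical catalog of all declared function schemas, keyed by name.
TOOL_CATALOG = {
    "getCurrentWeather": {
        "type": "function",
        "function": {
            "name": "getCurrentWeather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    },
    "createNotionTask": {
        "type": "function",
        "function": {
            "name": "createNotionTask",
            "description": "Create a task in Notion",
            "parameters": {
                "type": "object",
                "properties": {"title": {"type": "string"}},
                "required": ["title"],
            },
        },
    },
}

# Naive keyword router: pick only the schemas that look relevant to the
# message, so each request carries a fraction of the catalog's tokens.
KEYWORDS = {
    "getCurrentWeather": ["weather", "temperature", "forecast"],
    "createNotionTask": ["task", "todo", "notion"],
}

def select_tools(message: str) -> list[dict]:
    text = message.lower()
    return [
        TOOL_CATALOG[name]
        for name, words in KEYWORDS.items()
        if any(w in text for w in words)
    ]

# The selected subset would then be passed as `tools=` in the API call
# instead of the whole catalog.
```

Is there a built-in way to get this kind of dynamic selection, or does it have to be done on my side like this?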


FYI

I don’t fully understand what you seek, and I don’t use the API, but this might be of value, either directly or to inspire ideas.


For those not familiar with the OpenAI Cookbook:

Also see the list on GitHub, which might be easier to search.


Yes, I absolutely agree with @EricGT! Orchestration of Assistants is the only way around the token limitation (especially for the 4o model) at the current stage of OpenAI API development.

But it seems to me that this, without exaggeration, brilliant article (thank you very much, Ilan Bigio) still leaves the orchestration process more complicated than it needs to be.
After all, OpenAI essentially gave us a unique opportunity to write “contracts” for functions without obliging us to actually call them on our side. We can use these contracts purely to expand the model’s understanding of the user’s intent and to capture arguments. Then, based on the runStatus, we call an auxiliary Assistant and pass it the arguments along with strict instructions for calling the real functions.
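A minimal sketch of that “contract only” step, assuming the shape of a run that stopped with status `requires_action` in the Assistants API (the function name and arguments here are illustrative):

```python
import json

def extract_intents(run: dict) -> list[tuple[str, dict]]:
    """Read the tool calls from a run that stopped with status
    'requires_action' and return (function_name, arguments) pairs.
    Nothing is executed here; the pairs are forwarded to an
    auxiliary assistant instead."""
    if run.get("status") != "requires_action":
        return []
    calls = run["required_action"]["submit_tool_outputs"]["tool_calls"]
    return [
        (c["function"]["name"], json.loads(c["function"]["arguments"]))
        for c in calls
    ]

# Example payload in the shape the Assistants API returns:
sample_run = {
    "status": "requires_action",
    "required_action": {
        "submit_tool_outputs": {
            "tool_calls": [
                {
                    "id": "call_1",
                    "function": {
                        "name": "requiresWeatherFunction",
                        "arguments": '{"city": "Dublin", "units": "metric"}',
                    },
                }
            ]
        }
    },
}
```

The main Assistant’s function “contract” is thus reduced to intent recognition plus argument extraction; the real work happens elsewhere.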

Here is an approximate scheme (Assistants API for the main Assistant + Chat Completions for the auxiliary ones):

```mermaid
sequenceDiagram
    participant User
    participant Handler
    participant OpenAIFacade
    participant MainAssistant
    participant WeatherAssistant
    participant NotionAssistant
    participant Functions

    User->>Handler: handleMessage("What's the weather in Dublin?")
    Handler->>OpenAIFacade: handleUserMessage(message)

    OpenAIFacade->>MainAssistant: processMessage(threadId, message)
    Note over MainAssistant: Assistant API<br/>with function definitions<br/>for intent recognition

    MainAssistant-->>OpenAIFacade: requiresWeatherFunction({city: "Dublin", units: "metric"})

    OpenAIFacade->>WeatherAssistant: handleWeatherRequest(params)
    WeatherAssistant->>Functions: getCurrentWeather(params)
    Functions-->>WeatherAssistant: weatherData
    WeatherAssistant-->>OpenAIFacade: weatherDataResponse

    OpenAIFacade->>MainAssistant: saveToMainThread(weatherDataResponse)
    OpenAIFacade->>Handler: return weatherDataResponse
    Handler-->>User: sendMessage(weatherDataResponse)

    Note over User: Another scenario
    User->>Handler: handleMessage("Create task: Buy umbrella")
    Handler->>OpenAIFacade: handleUserMessage(message)

    OpenAIFacade->>MainAssistant: processMessage(threadId, message)
    Note over MainAssistant: Assistant API<br/>with function definitions<br/>for intent recognition

    MainAssistant-->>OpenAIFacade: requiresNotionFunction({title: "Buy umbrella", priority: "high"})

    OpenAIFacade->>NotionAssistant: handleTaskCreation(params)
    NotionAssistant->>Functions: createNotionTask(params)
    Functions-->>NotionAssistant: notionTaskData
    NotionAssistant-->>OpenAIFacade: notionDataResponse

    OpenAIFacade->>MainAssistant: saveToMainThread(notionDataResponse)
    OpenAIFacade->>Handler: return notionDataResponse
    Handler-->>User: sendMessage(notionDataResponse)
```

This way we can easily scale the project by adding new assistants and functions!
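The facade’s dispatch step can be sketched as a plain registry; handler names are illustrative, and in practice each handler would wrap its own Chat Completions call to the auxiliary assistant:

```python
def handle_weather_request(params: dict) -> str:
    # Would call the WeatherAssistant (Chat Completions) here.
    return f"Weather for {params['city']}"

def handle_task_creation(params: dict) -> str:
    # Would call the NotionAssistant (Chat Completions) here.
    return f"Created task: {params['title']}"

# Scaling the project = adding one registry entry per new assistant.
DISPATCH = {
    "requiresWeatherFunction": handle_weather_request,
    "requiresNotionFunction": handle_task_creation,
}

def dispatch(intent: str, params: dict) -> str:
    handler = DISPATCH.get(intent)
    if handler is None:
        raise ValueError(f"No assistant registered for {intent}")
    return handler(params)
```

Adding a new capability then touches only the registry and one new handler; the main Assistant just needs the matching function “contract”.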