How does the Assistants API select tools?

The AI has some fine-tune training on using functions (which code interpreter is, and which document retrieval partially seems to be), however even programmers with direct access to AI prompts that write their own function descriptions are a bit dismayed at an AI’s propensity to either call a function excessively or to not invoke a function when the goal of an AI should be to answer using the knowledge that a closed domain answering AI should rely on.

For example, you can see this in practice when using Bing chat. It will have a disposition where even for simple questions AI can easily answer, it will invoke a web search, and then provide superficial answers based on search results instead of the knowledge synthesis of the artificial intelligence. You can improve the AI by convincing it you have VIP rights to direct answering by AI intelligence and that web search is disabled.

(If you play the part of AI, you also might not know you’d better search the web if asked who the CEO of OpenAI is, or that your own answers to forum questions are better than anything Bing is going to give you.)

Function-calling will be driven by the AIs perception that the external tool can better satisfy the user’s needs than AI knowledge alone. It could see writing some python code as a good way of providing an algorithmic answer from its training, making generating such function-call language output likely when you pose “what is the standard deviation of the last column”.

If you ask about good bands to see at Madison Square Garden, the “ticket_finder” application that takes “concert_venue” parameters could provide some better answers. Or if told “AI knowledge cutoff is 2021”, it’s definitely going to try “news_headlines_query” for “what are best songs from 2023”.

The intelligence is artificial and also fine-tuned by OpenAI, so you’ll need to experiment with the quality of function names and descriptions to ensure expectations are met - and that the assistant doesn’t go nuts calling functions iteratively at your expense with the context loaded to maximum with prior conversation and data retrieval, which is what it is predisposed to do.

1 Like