Say you create a new assistant using GPT-4 and give it:
- 3 “file search” tools (docs about cheese, docs about fruit, docs about cars)
- another arbitrary tool like “look up stock prices”.
You send it a system prompt like “you are a helpful assistant named Bob” and a first message like “hello who are you?”
It of course does not use any tools and responds with something like “Hi, I’m Bob - how can I help?”
You then say “can you tell me about some good foods?” It may select the first two “file search” tools (cheese and fruit), but not the car docs or the stock-price tool.
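For concreteness, here’s roughly the kind of request I mean. This is just a sketch using the Chat Completions tool-calling style rather than the Assistants API, and the tool names/descriptions are made up for the scenario above:

```python
# Rough sketch of the setup above, using the Chat Completions tool-calling
# style. Tool names and descriptions are made up for the example.
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_cheese_docs",
            "description": "Search the cheese documentation for relevant passages.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
    # ...analogous entries for the fruit docs, car docs, and stock-price tools...
]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant named Bob."},
        {"role": "user", "content": "can you tell me about some good foods?"},
    ],
    tools=tools,
)

# If the model decided a tool is relevant, its choice shows up here:
print(response.choices[0].message.tool_calls)
```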
My question is - how does it “know” which tool to select?
There are two possibilities (not necessarily mutually exclusive):
- Inference-time: OpenAI “injects” some additional system prompt message like:
"These are teh tools you have available:
- tool 1 : description
- tool 2: description
-…"
Presumably we can detect this by checking how many input tokens we are charged for: if I send 10 input tokens but see I was charged for 100, then I know roughly 90 input tokens of additional instructions were injected (see the token-counting sketch after this list).
- Training-time: OpenAI has trained its “tool calling LLM” to somehow select tools without being specifically told to at inference time
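For the inference-time possibility, here’s roughly how I imagine the token-accounting check would go. This is only a sketch: it ignores the few tokens of per-message formatting overhead the chat format adds, and the single tool schema is made up:

```python
# Rough "token accounting" check: count the tokens you think you sent,
# then compare with what the API reports in usage.prompt_tokens.
import tiktoken
from openai import OpenAI

client = OpenAI()
enc = tiktoken.encoding_for_model("gpt-4")

system = "You are a helpful assistant named Bob."
user = "can you tell me about some good foods?"

# Naive count of the message text only (ignores the few tokens of
# per-message formatting overhead the chat format adds).
my_estimate = len(enc.encode(system)) + len(enc.encode(user))

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",  # made-up example tool
            "description": "Look up the latest stock price for a ticker symbol.",
            "parameters": {
                "type": "object",
                "properties": {"ticker": {"type": "string"}},
                "required": ["ticker"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ],
    tools=tools,
)

billed = response.usage.prompt_tokens
print(f"my estimate: {my_estimate}, billed: {billed}, gap: {billed - my_estimate}")
# A gap much larger than the formatting overhead suggests the tool
# descriptions (plus any other scaffolding) were serialized into the prompt.
```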
Of course, this question can (and does!) apply to open-source tool-calling LLMs as well.
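For open-source models it seems like you can inspect this directly: recent `transformers` versions accept tool definitions in `apply_chat_template`, and rendering the prompt as text (rather than token IDs) should show the tool descriptions being serialized into it. A rough sketch, where the model name is just an example of a model whose template documents tool use, and the tool schema is made up:

```python
# Sketch for an open-source model: render the chat template as text and
# look for the tool descriptions in the prompt. The model name is only an
# example (any model whose template supports a `tools` argument works),
# and the tool schema is made up.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",
            "description": "Look up the latest stock price for a ticker symbol.",
            "parameters": {
                "type": "object",
                "properties": {"ticker": {"type": "string"}},
                "required": ["ticker"],
            },
        },
    }
]

messages = [
    {"role": "system", "content": "You are a helpful assistant named Bob."},
    {"role": "user", "content": "can you tell me about some good foods?"},
]

# tokenize=False returns the rendered prompt string instead of token IDs,
# so you can see exactly what the model is conditioned on.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)  # the tool name/description should appear verbatim in here
```

If the tool text shows up in the rendered prompt, that would point to the “inference-time injection” answer, at least for that model family.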
Can anyone shed any light/share any references for this? Thanks!