(Note: This is not “official documentation”, but my personal interpretation)
I could not find any details on the internet, but from a technical perspective, all functions add more tokens to the input prompt.
So my assumption is that the total length of
… all messages, and
… all tool definitions
must fit inside the total token limit.
A single tool does not have a fixed length; it consists of a name, a description, a parameter list, etc. So a very complex tool uses more tokens than a simple get-content-from-URL tool.
In the end, tools + messages are all part of the input prompt; GPT is simply trained not to respond to the tool definitions themselves, but to utilize them while completing the message.
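To make the budgeting concrete, here is a minimal sketch that estimates the combined token footprint of messages plus a tool definition. The tool itself (`get_content_from_url`) and the ~4-characters-per-token heuristic are assumptions for illustration; for exact counts you would run the serialized JSON through an actual tokenizer.

```python
import json

# Hypothetical tool definition in the OpenAI function-calling schema shape;
# the specific tool is made up for illustration.
tool = {
    "type": "function",
    "function": {
        "name": "get_content_from_url",
        "description": "Fetch the raw text content of a web page.",
        "parameters": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "The URL to fetch."}
            },
            "required": ["url"],
        },
    },
}

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize https://example.com for me."},
]

def rough_token_estimate(obj) -> int:
    """Very rough heuristic: ~4 characters per token for English/JSON text.
    A real tokenizer (e.g. tiktoken) would give exact counts."""
    return len(json.dumps(obj)) // 4

TOKEN_LIMIT = 8192  # e.g. an 8k-context model

# Both messages and tool definitions count against the same limit.
used = rough_token_estimate(messages) + rough_token_estimate(tool)
print(f"Estimated prompt tokens: {used} of {TOKEN_LIMIT}")
```

Note how a tool with more parameters and longer descriptions would serialize to more JSON and therefore consume more of the same shared budget, which is exactly why complex tools are more expensive than simple ones.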