How many different function calls can a model learn how to do?

I’m using gpt-4o to produce educational courses for users.

I have split this up into several steps, which are executed by the AI as function calls.

How many different function calls can you teach a model how to do with just one system prompt and set of tools?

I’m wondering whether I should treat them all as totally separate tasks?

1 Like

Supplanting “AI” in the course of education is not a good idea. It hallucinates, it is trained on the internet which is a dumpster fire, and the creators of this tech do not know how it really works. So, if you are an educator, then educate, don’t use these experimental technologies to do your job for you.

@ben24, that’s a good question and is probably very model dependent, with lesser models like gpt 3.5 being far less capable of using large numbers of available functions due to its lower level of attention (and smaller input context capacity).

I’ve personally not found a limit but I only use a set of about 20.

The other thing to consider, though, is the more functions you use, the bigger your prompt is going to be, so your costs will go up, so some level of runtime configuration might make sense - I know some use the concept of “personas” to manage different sets.

Are any teachers perfect? Are the YouTube videos you watch to learn things error free?

No.

The technology has the potential to make learning more personalised than ever before, and FAR more accessible.

The more advanced the tech gets, the less mistakes it will make. But even now it is such a helpful tool that I have used to learn countless things far quicker than I could have done without the interactivity and personalisation AI facilitates.

2 Likes

I suppose it also depends how complex the functions are.

How detailed were the tools you’ve used when you’ve had a set of 20?

The size of the system prompt would probably be the biggest problem.

I guess you could change the system prompt based on what function you want the model to call, which might be what you’re saying about personas??

Would you mind explaining the personas and runtime configuration thing a bit more if you have time? Sounds pretty intriguing!!

You don’t need to lengthen the system prompt much but you can highlight specific functions if you so wish.

My main “baby” which receives ongoing maintenance is this:

I activate functions at runtime depending on settings:

Tbh don’t hesitate, just start developing. You will work stuff out through real experience.

4 Likes

Cool project man! You’re right, I’m just gonna get back to coding and figure it out!

1 Like

I use near 20 tools, no problem so far, but it already consumes around ~1k tokens each call. My last tool was achievement granting to users, really cool concept, see here allchat/server/tools.js at main · msveshnikov/allchat · GitHub

1 Like

(Note: This is not “official documentation”, but my personal interpretation)

I could not find any details on the internet, but from a technical perspective, all functions add more token to the input prompt.

So, my assumption is, that the total length of
… all messages, and
… all tool definitions
must fit inside the total token limit.

A single tool does not have a fixed length, but has a description, parameter list, etc. So a very complex tool uses more tokens than a simple get-content-from-URL tool.

In the end, tools + messages are all part of the input prompt; GPT simply is trained to not respond to tools, but utilize them to complete the message.

1 Like

Yeah this is true, but beyond token length it’s a good question because it is not clear how many functions it can “cope with” effectively: I don’t believe it’s a forgone conclusion that “if it fits within the token limit, it will work”?

I’m sure this varies with model beyond their max context stats would suggest.

i think the quantity is not the main problem, i think the function calling limit is more on each function definition. i was asking myself the same question. was trying to create a “multi use” function to prevent from creating many new function. it end up doing very badly. So i created many new function. I had to tweak a lot on the “function description” as soon as you have function that could be interpreted to do the same the model just randomly use them. if function are very clearly defined and cannot be interpreted “similar” from the model it will be fine. i was not tracking that much the token number so cannot talk about that part. but the limit previously stated seem pretty logic !