2000+ Functions through CoT

So I am building a function calling mechanism for 2000+ functions (think SQL) through Chain of Thought. The use case is an org that has 3000+ tables and different users with different skills in different departments. I have gone down the rabbit hole of generating dynamic SQL etc…

But for the moment, I am considering only these "hard-coded" functions. The question is whether function selection through tool calling is practical, assuming a taxonomy tree.

What I have done so far: created a POC for 100-odd functions at various levels of depth in the taxonomy tree (min_depth = 2, max_depth = 5). At each level, I employ tool calling to target a specific branch in the tree, then traverse all the way down until I hit the leaves.

So far it seems to be working OK. Of course, I make several tool calls to navigate through the hierarchy, so the response time is dependent on depth. I intend to track the logprobs through the sequence (currently it is only the best response at any given level).

Any pointers by others who have implemented similar would be much appreciated.

3 Likes

Assistants have a limit of 128 functions.

2000 functions is incredibly large and that’s a massive amount of context!

Remember for every Completion call you will have to inform the LLM afresh that you have all these functions and all their configuration.

That will be extremely expensive to run in Production even if you manage to fit all that context in (I’m too lazy to do the estimation on a Sunday evening).

I can hardly believe that will work tbh!

2 Likes

So at any one level, there’s only a max of 10 functions to go through because of the taxonomy tree.

At no point does the context exceed 10 functions. The max number of functions that the LLM sees is 50 (worst case). The best case is ~15.

2 Likes

But functions pushed within a Completions API call, including all their descriptions and parameters, count as input tokens, right?

2 Likes

Consider the following tree

                        ROOT
                      /      \
                 Sales        Marketing
                   |              |
   s_fn_1, s_fn_2, s_fn_3    m_fn_1, m_fn_2, m_fn_3

While I have 8 functions in total, the max (in total) that the LLM sees is 5. At any one time, the LLM sees at most 3 functions.
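
To make that concrete, here is a minimal sketch of how the per-level tool lists might be built from such a tree. The dictionary layout, names and stubbed-out descriptions/parameters are assumptions, stand-ins for whatever the real registry holds:

# Hypothetical layout of the example tree: node -> children (branches or leaf functions).
TREE = {
    "ROOT": ["sales", "marketing"],
    "sales": ["s_fn_1", "s_fn_2", "s_fn_3"],
    "marketing": ["m_fn_1", "m_fn_2", "m_fn_3"],
}

def tools_for(node):
    # One Chat Completions tool spec per child of the current node; real descriptions
    # and parameter schemas would come from the registry and are stubbed out here.
    return [{
        "type": "function",
        "function": {
            "name": name,
            "description": f"Select the {name} branch or function",
            "parameters": {"type": "object", "properties": {}},
        },
    } for name in TREE.get(node, [])]

# Resolving a query touches tools_for("ROOT") (2 specs) and then, say, tools_for("sales")
# (3 specs): 5 tool definitions in total, never more than 3 in one request, even though
# the registry holds 8 functions.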

3 Likes

OK so there is a pre-processing step?

Is the decision tree based on a form of embedding/semantic matching?

Is there documentation anywhere on this?

1 Like

@icdev2dev I’m going to challenge you on this one - I don’t think you are right, this is from the docs (link in the title):

"Understanding token usage

Under the hood, functions are injected into the system message in a syntax the model has been trained on. This means functions count against the model’s context limit and are billed as input tokens. If you run into token limits, we suggest limiting the number of functions or the length of the descriptions you provide for function parameters.

It is also possible to use fine-tuning to reduce the number of tokens used if you have many functions defined in your tools specification."

This does not appear to describe any pre-process which involves limiting the function calling context. I’m not sure where you got that from?

Therefore, I repeat my assertion that 2000 functions will be extremely expensive to run in Production and will probably fail due to pushing well past the model’s attention! You are also significantly eating into your context budget for any retrieved information.

1 Like

Fair enough. Let’s agree on the game parameters.

I will define 200 fictitious functions (well beyond the documented limit of 128 functions) distributed amongst 10 departments, with varying levels of depth and differing function descriptions.

Then let’s have three user queries, each one targeting a specific function, to see if I can hit that specific function. Note that once I can hit a specific function, it is only one more tool call to actually invoke it.

Does that sound good?
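
If it helps, here is one way I could stamp out the fictitious taxonomy for that test; the department names, branch names and counts below are invented purely for the experiment:

import random

DEPARTMENTS = ["sales", "marketing", "operations", "finance", "hr",
               "legal", "it", "support", "logistics", "procurement"]

def build_taxonomy(n_functions=200, min_depth=2, max_depth=5, seed=0):
    # Returns {function_name: path of branches from ROOT}, e.g.
    # {"operations_fn_17": ["operations", "operations_branch_2"]}.
    rng = random.Random(seed)
    taxonomy = {}
    for i in range(n_functions):
        dept = rng.choice(DEPARTMENTS)
        depth = rng.randint(min_depth, max_depth)
        path = [dept] + [f"{dept}_branch_{rng.randint(1, 3)}" for _ in range(depth - 2)]
        taxonomy[f"{dept}_fn_{i}"] = path
    return taxonomy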

1 Like

I believe that’s only for Assistants.

1 Like

Note the docs also say this:

"Keep the number of functions low for higher accuracy
We recommend that you use no more than 20 functions in a single tool call. Developers typically see a reduction in the model’s ability to select the correct tool once they have between 10-20 tools.

If your use case requires the model to be able to pick between a large number of functions, you may want to explore fine-tuning (learn more) or break out the tools and group them logically to create a multi-agent system"

That’s a universe away from 2000 functions!?

1 Like

Indeed. So is 200 functions (it’s an order of magnitude beyond 10-20). Do you agree?

1 Like

The approach somewhat makes sense. Any ambiguity is only in the actual implementation.

First, however: logprobs are turned off as soon as a tool is invoked.

You might see application, such as utilizing the top-3 enums the AI was going to send. OpenAI sees more application in denying you transparency.

Additionally, even the probability that a function will be called is stripped out of the “fake logprobs” you get. You cannot see that there was a 43% chance of token 1002xx, the one that signals the output handler to capture a tool.

You do not get token numbers at all; at best, you get string bytes.

So your cleverness is denied.

Then we get to the implementation.

An AI could emit a tool call for "which type of database category will answer this". Then, you do not need to return a tool call response; you could just place the more specialized tool function and call again.

It is only in the minutiae that we need to think deeper.

  • can the AI follow what it has been doing to get to that point, to back out and try another path;
  • if no fulfilling function is there, have you mandated a function anyway so that the AI doesn’t respond to the user, but pursues the chain?

So it is just a matter of having a clear design pattern that cannot fail and cannot go nuts with recursion, while still keeping an overall picture of why the function was called in the first place, even though the intensely specialized leaf to be reached cannot be shown up front.
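
A minimal sketch of that pattern, assuming a hypothetical tools_for(node) helper like the one sketched earlier: no tool result is returned between levels, the narrower tool list simply replaces the previous one, the path so far is restated in a system message so the AI can follow where it is, a go_back tool lets it back out, and a hard call cap prevents runaway recursion (the model name is a placeholder):

from openai import OpenAI

client = OpenAI()

GO_BACK = {"type": "function",
           "function": {"name": "go_back",
                        "description": "None of these fit; return to the previous level.",
                        "parameters": {"type": "object", "properties": {}}}}

def select_leaf(user_query, tools_for, max_calls=8):
    node, path = "ROOT", []
    for _ in range(max_calls):                      # hard cap: the walk cannot recurse forever
        tools = tools_for(node)
        if not tools:                               # leaf reached: this is the function to execute
            return node, path
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system",
                 "content": f"Pick the branch or function that best serves the user. "
                            f"Path chosen so far: {' > '.join(path) or '(root)'}"},
                {"role": "user", "content": user_query},
            ],
            tools=tools + ([GO_BACK] if path else []),
            tool_choice="required",                 # never answer the user directly mid-walk
        )
        choice = resp.choices[0].message.tool_calls[0].function.name
        if choice == "go_back":
            node = path.pop() if path else "ROOT"
        else:
            path.append(node)
            node = choice
    return node, path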

or: embeddings…

3000 functions? 3000 descriptions for them. 1 AI taught how to write like the descriptions. Top-20 results presented.

You don’t need to tell others meticulously why what you want to do is possible – you just have to make it possible.

1 Like

yes

Indeed. My original question was whether someone has accomplished something similar and, if so, whether they had some pointers. Then I got dragged into this challenge thing :slight_smile: which is not a bad thing.

2 Likes

I’m sure you might be able to run a preprocessing task locally to cut down the list of functions sent to the API, but I don’t see OpenAI doing this for us.

If any OpenAI preprocessing is happening, let’s see the docs please.

Back to the original question:

Why the heck do you need 2000 functions?

Can’t you make some of them more generic, e.g. write a SQL statement?

Thereby cutting down the list by a factor of 100.

We have parameters too you know? :wink:
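
For illustration, a single generic tool along these lines, expressed as a Chat Completions tool spec, can stand in for a whole family of hard-coded per-table functions. The table names and parameters below are invented:

# One generic read-only query tool instead of hundreds of per-table list functions.
query_table_tool = {
    "type": "function",
    "function": {
        "name": "query_table",
        "description": "Run a read-only query against an approved table.",
        "parameters": {
            "type": "object",
            "properties": {
                "table": {"type": "string",
                          "enum": ["customers", "invoices", "tickets"]},
                "where": {"type": "string",
                          "description": "SQL WHERE clause, e.g. \"region = 'EMEA'\""},
                "order_by": {"type": "string"},
                "limit": {"type": "integer"},
            },
            "required": ["table"],
        },
    },
}

Your execution layer would of course still need to validate the table name and WHERE clause before anything touches the database.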

1 Like
    (flowchart diagram)

1 Like

So rather than a tree (I’m not sure what this buys you?), how about this:

  • Create embeddings of all function definitions
  • Each time the user queries, get the embedding of the user query
  • Take the top x% of functions with closest cosine similarity to the query (x might be 1 in your case)
  • Include them in the API call along with the user query.

Essentially you are introducing a pre-processing step ahead of the API call to optimise the function population in that call.

This might not work for cases where a large number of functions are very similar but I can see this working for a semantically disparate set?

As an example say the user asks “calculate the cube root of 10 plus the cube root of 20”

Now obviously the calculator function (assuming it is present) will be semantically similar to this query so will pass through the filter.

Other functions like “search the internet” will not.
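
A rough sketch of that pre-filter, using OpenAI embeddings and plain cosine similarity (the embedding model and the top-k cut-off here are arbitrary choices):

import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts, model="text-embedding-3-small"):
    resp = client.embeddings.create(model=model, input=texts)
    return np.array([d.embedding for d in resp.data])

# A couple of illustrative tool specs; the real list would hold every function definition.
function_specs = [
    {"type": "function", "function": {"name": "calculator",
        "description": "Evaluate arithmetic: roots, powers, basic maths."}},
    {"type": "function", "function": {"name": "web_search",
        "description": "Search the internet for current information."}},
]
description_vectors = embed([f["function"]["description"] for f in function_specs])

def shortlist(user_query, top_k=10):
    # Cosine similarity between the query and every function description;
    # only the closest top_k specs go into the actual completion call.
    q = embed([user_query])[0]
    sims = description_vectors @ q / (
        np.linalg.norm(description_vectors, axis=1) * np.linalg.norm(q))
    return [function_specs[i] for i in np.argsort(sims)[::-1][:top_k]]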

The other massive advantage of this approach is that it will in any case save you money and leave more context for input, output and RAG.

You might be able to handle large sets of similar functions by combining them and relying more on parameter content.

I believe this might be a reasonable scheme to vastly increase the number of supported functions within current API limitations.

Kind of makes me wonder if this technique could be powerful enough to send the exact appropriate single function to the API?!

Thoughts?

Was your “taxonomy tree” approach also intended to be a pre-processing step prior to the API call? Perhaps you could have made that clearer?

Apologies if that was supposed to be obvious, but the penny has only now dropped for me.

1 Like

Rules to live by right here, very well said🐰

1 Like

Since the goal is to have a managed and controlled set of functions (2000+ functions can get quite hairy to manage), it is better to give control to individual departments to manage their own sub-trees… at least that is the theory. We will see.

from typing import Annotated

registry = ManagedFunctionRegistry()

@registry.managed_function()
def operations_functions():
    """
    This is where all operations-related functions are listed. This is typically related to
    support of customers, monitoring of our production systems, and invoice approval for our
    vendors.
    """
    return None


@registry.managed_function("operations_functions", "customer_support", "customers")
def ops_cs_customers_list_by(criterion: Annotated[str, "The criterion to use for listing customers"],
                             order: Annotated[str, "How to list the customers, i.e. ascending or descending?"] = "ASC",
                             limit: Annotated[str, "How many customers to list?"] = None) \
        -> Annotated[dict,
                     """
                     :return: Returns a dictionary with keys
                     - list_customers (List[str]): the JSON list of customers encoded as a string.
                     """]:
    """
    This function returns a list of customers for a specific criterion in an ordered list of strings.
    """
    # Stub body: the real implementation would query the customers table.
    return {"list_customers": "[]"}
In the example above, each function’s decorator incorporates it into the tree:

  • operations_functions() is incorporated at the root
  • ops_cs_customers_list_by is incorporated four levels down: ROOT -> “operations_functions” -> “customer_support” -> “customers”
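
For completeness, a bare-bones sketch of the kind of bookkeeping a decorator like this can do, just enough to file each function under its path and list the children of a node:

class ManagedFunctionRegistry:
    # Bare-bones sketch: each decorated function is filed under its path in the taxonomy tree.

    def __init__(self):
        self._children = {}    # path tuple -> list of child branch / function names
        self._functions = {}   # function name -> callable

    def managed_function(self, *path):
        def decorator(fn):
            full = ("ROOT",) + tuple(path)
            # Register every intermediate branch so each level can be navigated to.
            for i in range(1, len(full)):
                kids = self._children.setdefault(full[:i], [])
                if full[i] not in kids:
                    kids.append(full[i])
            self._children.setdefault(full, []).append(fn.__name__)
            self._functions[fn.__name__] = fn
            return fn
        return decorator

    def children_of(self, *path):
        # What the LLM gets to choose from at this level of the tree.
        return self._children.get(("ROOT",) + tuple(path), [])

children_of() is then what feeds the tool list presented to the LLM at each level.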

Originally I was looking to use LLMs to make the function calls at every level, methodically going down a level each time a choice is made at that level.

That’s a great idea. Both the latency and expense of making tool calls are avoided. I will experiment with this option, BUT still while navigating the tree. That way the algo maintains control over which branch to choose.

I believe that with the embeddings it is entirely possible to send ONLY the single function to the API.

1 Like

Check out semantic router; it’s built to do exactly this. From the people who make Pinecone, but you can use whatever embeddings and store you prefer.

3 Likes

Yes, that’s a similar use case for this strategy - but does it dynamically alter the tool population too? I didn’t think so?

But it’s moot for me: I’m exotic. I’m living in the Ruby on Rails ecosystem. Usually I code everything from first principles :sweat_smile: (and that’s good for both my understanding of the technology and my level of control)

1 Like