Hello OpenAI community,
I’m currently developing a project that leverages a substantial thirdparty Python library, which comprises around 3000 predefined functions. Each function in this library corresponds to a specific task, and I have a detailed description of each function, as well as how a user might verbally command that function.
For instance, if a function calculates the exponential of a number (exp(x)
), a user might command it by saying, “What’s e to the power of 10?” My ultimate goal is to create a model capable of mapping user commands to the appropriate function or series of functions from this predefined library.
This brings me to the crux of my problem: when a user command is received, I want the model to identify the corresponding function or sequence of functions to execute. It’s crucial to note that these functions can only be chosen from the predefined list of 3000, and the model should not generate or “hallucinate” its own functions.
An additional layer of complexity arises when considering compound commands or stacked commands. For instance:

Compound command example: If the library only includes functions to multiply and divide by integers, and the user command is “Multiply this by 1.5”, the model should return a sequence like
multiplyTwoNumbers > divideTwoNumbers
(parameters not required for now). Note that these two functions are part of those predefined 3000. 
Stacked command example: If the user command is “Multiply by 10, then print the value,” the model should output a sequence like
multiplyTwoNumbers > printValue
. Again, these two functions are part of the 3000.
I attempted to finetune a GPT model for this task, but encountered a couple of issues. Primarily, the model tends to hallucinate functions that are not present in the predefined library. Additionally, it struggles with compound commands, which I suspect arises from the fact that finetuning primarily impacts the NLP component of the model, rather than enhancing its capacity to handle the mathematical or algebraic reasoning required for this task.
I would deeply appreciate any guidance or suggestions on how I could approach this problem. Is there a way to limit the model’s output to only the predefined set of functions? And is there a strategy to effectively handle compound or stacked commands?
Thank you in advance for your help!