Does anyone have ideas on ways to batch calls to GPT when using functions? I would like to make what is logically a single API call, but split it across multiple requests due to potential token limits. Each batch may come back with a different function_call selected, or some may return content while others return a function_call, or even when they all select the same function_call the function arguments may differ. I'm not sure how to consolidate the results into a single logical result.
For example, my application takes a prompt input from the user and a list of calendar events associated with the user, then calls gpt-3.5-turbo with several function definitions like "select_events", "add_event", "modify_event", the idea being that GPT infers which operation the user intends to perform on their calendar events. Because there could be quite a few calendar events, I call the API multiple times with the same user prompt but a different subset of the events in each call. Conceivably the model might return different function_calls and different function arguments. For the prompt "change my doctor appointment on Tuesday to 2pm", the batch containing the doctor appointment event will correctly select the modify_event call, but a batch without the doctor appointment will return content explaining there is no doctor appointment scheduled. Programmatically the app can't tell which batch's result is the most appropriate one to proceed with, and there's no logical way I can think of to consolidate the different batch results into one result to return to the user.
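To make the setup concrete, here's roughly what the batched calls look like (heavily simplified; the schemas, the batch size of 25, and get_user_events are placeholders for what the app actually does):

```python
import openai  # legacy 0.x SDK with the functions API

# Simplified schema; the real definitions have more fields
FUNCTIONS = [
    {
        "name": "modify_event",
        "description": "Change the time or details of an existing calendar event",
        "parameters": {
            "type": "object",
            "properties": {
                "event_id": {"type": "string"},
                "new_start_time": {"type": "string", "description": "ISO 8601 datetime"},
            },
            "required": ["event_id"],
        },
    },
    # ...select_events, add_event, etc.
]

def call_batch(user_prompt, events_batch):
    """One API call per batch of events; each call may pick a different function."""
    return openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": f"The user's calendar events: {events_batch}"},
            {"role": "user", "content": user_prompt},
        ],
        functions=FUNCTIONS,
        function_call="auto",
    )

user_prompt = "change my doctor appointment on Tuesday to 2pm"
events = get_user_events()  # placeholder: however the app loads the user's events
batches = [events[i:i + 25] for i in range(0, len(events), 25)]
results = [call_batch(user_prompt, b)["choices"][0]["message"] for b in batches]
# results is now a mix of function_call messages and plain content messages
```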
Any ideas how to approach this?
Just spit ballin’, but here’s my initial thoughts:
- There are limits to these models that put hard constraints on your implementation. At a certain point, you either have to compromise or recognize that your implementation won't work (not saying that's necessarily the case here). Of course, if you have access to models with larger contexts, that can be a solution (at least to a degree). If cost/speed/etc. is a concern, you can implement some logic to figure out the smallest context you can use for a specific task and select that one. Not sure if you're doing this, but that same logic should also decide when you actually need to split up the function calls, otherwise defaulting to doing them all in one call (see the token-counting sketch after this list).
- All of the information needed for a single decision has to be passed in a single call. These models are stateless, so the decision process is effectively Markovian. Your approach of splitting the data across multiple calls may work, but be aware that as soon as a decision requires information spanning multiple calls, it simply won't work (at least not as reliably as GPT can otherwise be).
- One approach for your specific problem is hierarchical decision making. Basically, get the results of each of your calls as you are now, then pass those results (along with appropriate context) to GPT again to get one final result. You may have to define a "meta" function which GPT can use to make its decision (sketched after this list). You can think of this approach like a manager making a decision based on the isolated input of a collection of subordinates. If you do take such an approach, you should also consider reworking the logic you use to group the state info with the function calls. I'm not sure if you're doing anything like that now, but you could imagine implementing some light, heuristic logic to group the calls intelligently so that the results are easier for your "manager" to deal with.
- As an alternative to (or maybe in conjunction with) the previous suggestion, you may be able to write higher-level functions that consolidate your current functions and make the problem more tenable for a single call. For example, "add_event" and "modify_event" might be replaceable with "upsert_event", or maybe you can even abstract all of these into a single higher-level function that you inject some logic into (see the schema sketch below).
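On the first point, the gating logic could be as simple as counting tokens before deciding whether to batch at all. A rough sketch (the 3,500-token budget and the naive serialization are placeholders; real accounting for chat message and function-schema overhead is a bit more involved):

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def needs_batching(user_prompt, events, function_defs, budget=3500):
    """Only fall back to splitting when the full payload won't fit in one call."""
    payload = user_prompt + str(events) + str(function_defs)
    return len(enc.encode(payload)) > budget
```

If that returns False, you do everything in one call and the consolidation problem never comes up.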
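On the hierarchical idea, here's a sketch of what the "manager" pass could look like. The final_decision schema and the summary format are just one way to frame it, and this assumes batch_messages holds the message objects from your per-batch calls:

```python
import json
import openai

FINAL_DECISION_FN = {
    "name": "final_decision",
    "description": "Pick the single calendar action that best satisfies the user's request",
    "parameters": {
        "type": "object",
        "properties": {
            "function_name": {"type": "string", "description": "e.g. select_events, add_event, modify_event, or none"},
            "arguments": {"type": "string", "description": "JSON-encoded arguments for that function"},
            "reason": {"type": "string"},
        },
        "required": ["function_name"],
    },
}

def consolidate(user_prompt, batch_messages):
    """Feed each batch's output back to the model and force one final decision."""
    summaries = []
    for i, msg in enumerate(batch_messages):
        if msg.get("function_call"):
            summaries.append(f"Batch {i}: {msg['function_call']['name']}({msg['function_call']['arguments']})")
        else:
            summaries.append(f"Batch {i}: {msg.get('content')}")
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Each batch below saw a different slice of the user's calendar. Choose the one action that best answers the request."},
            {"role": "user", "content": user_prompt + "\n\n" + "\n".join(summaries)},
        ],
        functions=[FINAL_DECISION_FN],
        function_call={"name": "final_decision"},  # force the meta function
    )
    return json.loads(response["choices"][0]["message"]["function_call"]["arguments"])
```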
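And on the last point, the consolidation might be as simple as replacing the two schemas with one (field names made up; deciding whether a matching event already exists is something your own code resolves, not the model):

```python
UPSERT_EVENT_FN = {
    "name": "upsert_event",
    "description": "Create a calendar event, or update it if a matching one already exists",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "start_time": {"type": "string", "description": "ISO 8601 datetime"},
            "end_time": {"type": "string", "description": "ISO 8601 datetime"},
        },
        "required": ["title", "start_time"],
    },
}
```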
There are probably plenty more approaches, but you may have to provide more information if you want more specific suggestions. Also, I'm assuming you already have, but if not, this is probably a good question for ChatGPT.
Thanks for your insights @nathanmargaglio! You've given me some food for thought. At this point I'm thinking I'll just have to stick to the principle of reducing the number of calendar events in the input wherever possible (by querying and filtering based on a preliminary API call), and go up to the 16k context as needed if there are still a lot of events to send in. It doesn't seem like batching is an appropriate strategy when using function calls like this.
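Something along these lines for the preliminary call, maybe: send only a compact index of id/title/date, get back the relevant ids, then make the real call with just those events. (Sketch only; the filter_events schema and the event fields are made up.)

```python
import json
import openai

FILTER_EVENTS_FN = {
    "name": "filter_events",
    "description": "Return the ids of calendar events that could be relevant to the user's request",
    "parameters": {
        "type": "object",
        "properties": {
            "event_ids": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["event_ids"],
    },
}

def relevant_events(user_prompt, events):
    """Preliminary pass over a compact index so the full event list never has to fit."""
    index = "\n".join(f"{e['id']}: {e['title']} on {e['date']}" for e in events)
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "The user's calendar index:\n" + index},
            {"role": "user", "content": user_prompt},
        ],
        functions=[FILTER_EVENTS_FN],
        function_call={"name": "filter_events"},
    )
    ids = json.loads(response["choices"][0]["message"]["function_call"]["arguments"])["event_ids"]
    return [e for e in events if e["id"] in ids]
```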