How to control assistant function calls with more precision?

I built a little tool with the Assistants API that helps me fill out a form by having a conversation. Instead of working through the form myself, I can just have a chat and the bot fills it out for me.

The problem I’m running into is controlling when the bot saves information to the form and when it ends the conversation.

My working solution was to create two custom functions: one to update the form and the other to end the conversation.

The system prompt then has a section that covers how to use the update function:
When you receive any new information that could answer any of the 5 questions above, call the 'updateForm' function immediately. There is no need to wait until the end of the conversation. Pass the new information as an object with one or more of the following keys that has the following signature: ...

Similarly, the end conversation function has its own snippet:
Once you have good answer for each of these questions you must end the conversation. To end the conversation you have to call the 'endConversation' function.
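
For illustration, the two functions described above might be declared roughly like this. It is a minimal sketch that assumes the usual function-tool format (a name, a description, and JSON Schema parameters). The field names `question_1` to `question_5` are hypothetical placeholders, since the real signature is elided in the prompt snippet.

```python
# Hypothetical form fields: the real keys are not shown in the post.
FORM_FIELDS = ["question_1", "question_2", "question_3", "question_4", "question_5"]


def build_tools(fields):
    """Return function-tool definitions for updateForm and endConversation."""
    update_form = {
        "type": "function",
        "function": {
            "name": "updateForm",
            "description": "Save one or more newly learned answers to the form.",
            "parameters": {
                "type": "object",
                # Every key is optional, so a call can carry a partial update.
                "properties": {name: {"type": "string"} for name in fields},
                "additionalProperties": False,
            },
        },
    }
    end_conversation = {
        "type": "function",
        "function": {
            "name": "endConversation",
            "description": "End the conversation once every question has a good answer.",
            # This function takes no arguments.
            "parameters": {"type": "object", "properties": {}},
        },
    }
    return [update_form, end_conversation]


TOOLS = build_tools(FORM_FIELDS)
```

The resulting list would be passed as the assistant's tools. The descriptions repeat the system prompt instructions in short form, which is an assumption about how the setup is wired rather than a detail taken from it.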

This works but it’s impractically brittle.

Sometimes it just won’t save information to the form. Sometimes it will save information to the form multiple times in a conversation, with little updates each time. Other times it waits until the end of the conversation and then fills out the entire form in one go.

I’ve tried updating the system prompt with stipulations like “You must call this function at least once before ending the conversation.” but it hasn’t solved the problem. These kinds of system prompt clarifications get it to work roughly 80% of the time now, but that still means it fails about 1 in 5 times.
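
To make the stipulation above concrete, here is a rough sketch of what the same rule could look like as a check in the code that handles the function calls, instead of a sentence in the prompt. Everything in it is an assumption for illustration: the `FormSession` class, its method names, and the `question_*` field names are hypothetical and not part of the Assistants API.

```python
import json

# Hypothetical field names; the real form keys are not given in the post.
REQUIRED_FIELDS = ("question_1", "question_2", "question_3", "question_4", "question_5")


class FormSession:
    """Tracks form state and decides how to answer each function call."""

    def __init__(self, required=REQUIRED_FIELDS):
        self.required = tuple(required)
        self.form = {}
        self.ended = False

    def missing(self):
        """List the required fields that have no saved answer yet."""
        return [name for name in self.required if name not in self.form]

    def handle_tool_call(self, name, arguments_json):
        """Apply one function call and return the string to send back as its output."""
        args = json.loads(arguments_json or "{}")
        if name == "updateForm":
            # Keep only known, non-empty fields; later values overwrite earlier ones.
            accepted = {k: v for k, v in args.items() if k in self.required and v}
            self.form.update(accepted)
            return json.dumps({"saved": sorted(accepted), "missing": self.missing()})
        if name == "endConversation":
            missing = self.missing()
            if missing:
                # Refuse to end and tell the model what is still unanswered.
                return json.dumps({"error": "form incomplete", "missing": missing})
            self.ended = True
            return json.dumps({"status": "ended"})
        return json.dumps({"error": f"unknown function {name}"})
```

The returned string would be submitted as the output for that function call, so an early `endConversation` comes back to the model as an error listing the unanswered questions instead of silently closing the chat. Repeated small `updateForm` calls are harmless here because each one just merges into the same dictionary.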

I was wondering if anyone has encountered a similar problem and has suggestions on a good way to approach this?

As I see it I only have two options:

  • I could spend more time improving the system prompt. If that’s the solution, fine. I just feel like I’m hitting diminishing returns though. Knowing from others that this is where I should be focusing would be reassuring.
  • I could build a second assistant that supervises the first one. Dedicating a whole “babysitter” bot to each function will probably close the gap here but it feels like overkill. It will also get ridiculously expensive further down the line.

I’m sure there are other options but I can’t see them. Any help or inspiration here would be much appreciated.