I’ve been prototyping a real-time interface for an EMR for a client of mine, and they’re loving it. The ability for them to ask the assistant for information, and its ability to present data, give quick analyzes, and really give the doctor a hands off tool has been transformative.
The Setup
In this particular setup, the doctor is in an exam room with a patient during an encounter - so there are times when the doctor is talking to the patient, not the real-time model.
- This would result in the real-time model responding, thinking it was supposed to
- We even instructed it to not respond in these situations, but it just wouldn’t shut up.
The Problem
The way these systems are designed, by their nature, is centered around an input-output pairing.
- For every input, or group of inputs
- There’s an output that’s created
So when you’re in a situation where the input should technically be ignored (and you don’t want to turn the system off because it needs to be on so it’s a hands-off thing)… You run into a problem - the AI responds.
The Solution
The
continue_waiting
tool
{
type: "function",
name: "continue_waiting",
description: "Call this tool when you receive input that is not directed towards you, it will skip this input and wait for the next. This is also a tool you can use to let the user have the last word, which you should do often (as there is no utility and responding to thanks, gratitude, or other superflous conversation).",
parameters: {
type: "object",
properties: {
keep_waiting: {
type: "string",
description: "An arbitrary parameter; just set this to 'true'"
}
},
required: ["keep_waiting"]
}
}
I created a tool that the AI could call that literally does nothing. (I throw a note into the visible logs, just for now, to confirm that it’s continuing to wait.) We also specified a wake word that the AI should pay attention to, calling the wait tool for anything that doesn’t start or include its wake word.
This means that anytime that the AI picks up on some audio input, and it’s not directed to it… And since by the nature of how these systems work it’s going to produce some sort of output… Now the AI can call the wait tool, and do nothing.
It makes the assistant happy, because it can produce an output; it makes us happy, because it’s not doing anything.
This solution has really solved a lot of smaller smoothness problems, allowing the system to become a third party observer to an interaction.
I thought I would share, as this sort of setup would allow for all sorts of passive types of systems to be created.
Hope it helps!