Realtime API sometimes creates speech before a tool call, sometimes doesn't

nickgme · March 27, 2025, 2:01pm

There is no real control to instruct the system whether it should create speech before calling a function.
For example I tried to prevent the system to generate speech before calling a function by explicitly instructing it multiple times in the base prompt:

Do not announce your function calls or tool calling plans, this is very important.
Never generate audio or text before calling a function.
If you want to call a tool just say nothing.
When you need to use a tool, call it immediately without explaining that you’re about to do so.

Furthermore I implemented a fake conversation history (basically in context learning), explicitly containing one user prompt followed by a function call and an answer (showing the system that no audio is generated before calling the function). This helped but there is still a ~15% chance that it generates speech before calling a function. Also, as far as I know, there is no way that we know in hindsight whether the audio will be followed by a function call or not.

The only thing left I could do is change the function descriptions, explicitly mentioning that it should not create audio when calling the function.

Topic		Replies	Views
Function call not being followed up by audio event API realtime	4	310	February 12, 2025
Ensuring AI Speech Completes Before Executing Function Calls API realtime	2	127	March 13, 2025
Realtime API: Looks like there is a hidden system prompt, even in API mode API realtime	1	495	January 8, 2025
Long function calls and realtime API API realtime , api-realtime	0	199	February 12, 2025
Handling Overlapping Responses in Realtime API When Tools Take Too Long API realtime	0	142	April 1, 2025

Realtime API sometimes creates speech before a tool call, sometimes doesn't

Related topics