Unable to interrupt and stop model speaking

I am trying to interrupt the audio output of the model in the following way:

            if etype == "response.audio.delta":
                model_speaking = True
                audio_chunk_b64 = data.get("delta", "")
                logger.info(f"MODEL SPEAKING {model_speaking}")
                if audio_chunk_b64:
                    await frontend_ws.send(json.dumps({"chunk": audio_chunk_b64}))
                    system_audio_buffer.extend(base64.b64decode(audio_chunk_b64))
            elif etype == "input_audio_buffer.speech_started":
                user_speaking = True
                logger.info(f"User Speaking: {user_speaking}")
                if model_speaking:
                    logger.info(f"user audio detected while model speaking was: {model_speaking}")
                    await openai_ws.send(json.dumps({"type": "response.cancel"}))
                    await openai_ws.send(json.dumps({
                        "type": "conversation.item.truncate",
                        "item_id": item_id,
                        "content_index": 0,
                        "audio_end_ms": 1500
                        }))
                    await frontend_ws.send(json.dumps({"chunk": ""}))

My logs show that I do enter this branch:

if model_speaking:

And that results in the following message from OpenAI:

"response.audio.done"

Then I get the following error:

{'type': 'error', 'event_id': 'event_B1UXupE5UePHVSvHa8jhy', 'error': {'type': 'invalid_request_error', 'code': None, 'message': 'Cancellation failed: no active response found', 'param': None, 'event_id': None}}

And then I get the following message:

"conversation.item.truncated"

After this point, the model continues its audio response (meaning I keep receiving "response.audio.delta" events). Once it has finished that response, it responds to the message that was supposed to interrupt it.

I can’t figure out for the life of me what is going on. Any help is appreciated.

Thanks

Did you figure it out? I’m dealing with the same issue. I was assuming that turn_detection would mean that I wouldn’t even have to tell it to stop talking.

Decent example code here that shows the flow.

From the article: “For simplicity, this code doesn’t implement interruption handling.”

They do link to a repo that does, though. Apparently I can’t post links in my reply (that makes sense ???). Just go to the repo twilio-samples/speech-assistant-openai-realtime-api-python on GitHub.
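For anyone who doesn't want to dig through the repo, here is a rough sketch of the general pattern as I understand it, not a copy of their code: work out how many milliseconds of the assistant's audio have actually gone out, truncate the item at that point, and tell Twilio to drop whatever it still has buffered. The function name and its parameters are made up for illustration; the two payload shapes are the Realtime API "conversation.item.truncate" event and the Twilio Media Streams "clear" message.

```python
import json
from typing import Optional, Tuple


def build_interrupt_payloads(
    last_assistant_item: Optional[str],
    latest_media_timestamp_ms: int,
    response_start_timestamp_ms: Optional[int],
    stream_sid: str,
) -> Optional[Tuple[str, str]]:
    """Hypothetical helper: returns (openai_message, twilio_message) or None.

    Assumes the caller records the Twilio media timestamp at which the
    current assistant item started playing, and the latest one seen.
    """
    if last_assistant_item is None or response_start_timestamp_ms is None:
        # Nothing is playing, so there is nothing to interrupt.
        return None

    # Milliseconds of assistant audio sent since the item started.
    elapsed_ms = max(0, latest_media_timestamp_ms - response_start_timestamp_ms)

    truncate = {
        "type": "conversation.item.truncate",
        "item_id": last_assistant_item,
        "content_index": 0,
        "audio_end_ms": elapsed_ms,
    }
    # Ask Twilio to discard audio it has buffered but not yet played.
    clear = {"event": "clear", "streamSid": stream_sid}
    return json.dumps(truncate), json.dumps(clear)
```

You would call this when "input_audio_buffer.speech_started" arrives, send the first string to the OpenAI socket and the second to the Twilio socket, then reset your own bookkeeping. If you aren't on Twilio, the second message becomes whatever "flush your playback queue" means for your frontend.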

To interrupt, you can send a new chunk of microphone audio; that should stop the model talking, and you’ll get a new, updated response.

That hasn’t been true in my experience, sadly. I basically stream the audio from Twilio straight to OpenAI, and it doesn’t stop on its own. I have to send “conversation.item.truncate” for it to stop talking.
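One possible reading of the "Cancellation failed: no active response found" error earlier in this thread: the model usually generates audio faster than it plays back, so the response may already be finished on the server by the time the user starts talking. In that case there is nothing left to cancel, and the audio you still hear is whatever your own client has queued. Below is a minimal sketch of bookkeeping that only sends "response.cancel" while a response is still active and derives "audio_end_ms" from real numbers instead of a hard-coded value. The class and the `played_ms` argument are hypothetical, and it assumes 16-bit mono PCM output at 24 kHz.

```python
import base64
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumption: output audio is pcm16, mono, 24 kHz -> 48 bytes per millisecond.
BYTES_PER_MS = 24_000 * 2 // 1000


@dataclass
class InterruptTracker:
    """Hypothetical helper that tracks the state needed to interrupt."""

    response_active: bool = False
    item_id: Optional[str] = None
    content_index: int = 0
    bytes_received: int = 0

    def on_event(self, event: Dict) -> None:
        """Update state from a server event."""
        etype = event.get("type")
        if etype == "response.created":
            self.response_active = True
        elif etype == "response.done":
            self.response_active = False
        elif etype == "response.audio.delta":
            if event.get("item_id") != self.item_id:
                # A new assistant item started; restart the byte count.
                self.item_id = event.get("item_id")
                self.content_index = event.get("content_index", 0)
                self.bytes_received = 0
            self.bytes_received += len(base64.b64decode(event.get("delta", "")))

    def interrupt_messages(self, played_ms: int) -> List[Dict]:
        """Messages to send to the API when the user starts speaking.

        played_ms is how much of the item the frontend reports having
        played; it is clamped to the amount of audio actually received.
        """
        messages: List[Dict] = []
        if self.response_active:
            # Only cancel while the server is still generating.
            messages.append({"type": "response.cancel"})
        if self.item_id is not None:
            received_ms = self.bytes_received // BYTES_PER_MS
            messages.append({
                "type": "conversation.item.truncate",
                "item_id": self.item_id,
                "content_index": self.content_index,
                "audio_end_ms": min(played_ms, received_ms),
            })
        return messages
```

Feed every server event through `on_event`, and when "input_audio_buffer.speech_started" arrives, send each dict from `interrupt_messages(...)` with `json.dumps`. Either way, the frontend also has to throw away the chunks it has already queued, otherwise the voice keeps going no matter what the server does.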