I am trying to interrupt the audio output of the model in the following way:
    if etype == "response.audio.delta":
        model_speaking = True
        audio_chunk_b64 = data.get("delta", "")
        logger.info(f"MODEL SPEAKING {model_speaking}")
        if audio_chunk_b64:
            await frontend_ws.send(json.dumps({"chunk": audio_chunk_b64}))
            system_audio_buffer.extend(base64.b64decode(audio_chunk_b64))
    elif etype == "input_audio_buffer.speech_started":
        user_speaking = True
        logger.info(f"User Speaking: {user_speaking}")
        if model_speaking:
            logger.info(f"user audio detected while model speaking was: {model_speaking}")
            await openai_ws.send(json.dumps({"type": "response.cancel"}))
            await openai_ws.send(json.dumps({
                "type": "conversation.item.truncate",
                "item_id": item_id,
                "content_index": 0,
                "audio_end_ms": 1500
            }))
            await frontend_ws.send(json.dumps({"chunk": ""}))
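For clarity, the interrupt decision in my handler boils down to the pure logic below (a distilled sketch for discussion only; `handle_event` and the placeholder `item_id` are my own names, not part of the OpenAI API, and the real sends happen over the websockets shown above):

```python
import json

def handle_event(etype, state, item_id="item_placeholder", audio_end_ms=1500):
    """Given one server event type and the current speaking flags,
    return the list of JSON messages that would be sent to OpenAI.

    `state` is a dict holding the 'model_speaking' and 'user_speaking'
    flags from the handler above.
    """
    outgoing = []
    if etype == "response.audio.delta":
        # Model is producing audio; remember that it is speaking.
        state["model_speaking"] = True
    elif etype == "input_audio_buffer.speech_started":
        state["user_speaking"] = True
        if state["model_speaking"]:
            # User barged in while the model was speaking:
            # cancel the in-flight response and truncate its audio item.
            outgoing.append(json.dumps({"type": "response.cancel"}))
            outgoing.append(json.dumps({
                "type": "conversation.item.truncate",
                "item_id": item_id,
                "content_index": 0,
                "audio_end_ms": audio_end_ms,
            }))
    return outgoing
```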
My logs show that I do enter the `if model_speaking:` branch, and that results in the following message from OpenAI:
"response.audio.done"
Then I get the following error:
{'type': 'error', 'event_id': 'event_B1UXupE5UePHVSvHa8jhy', 'error': {'type': 'invalid_request_error', 'code': None, 'message': 'Cancellation failed: no active response found', 'param': None, 'event_id': None}}
And then I get the following message:
"conversation.item.truncated"
After this point, the model continues its audio response (I keep receiving "response.audio.delta" events). Once it finishes its audio response, it then replies to the message that was supposed to interrupt it.
I can’t figure out for the life of me what is going on. Any help is appreciated.
Thanks