I couldn't reply to my original thread, @Sean-Der.
Hey! Sorry for the delay, I was trying to get some changes out and it was harder than I expected.
You shouldn't need to adjust the audio.input.format; this is all negotiated during the Offer/Answer in SIP. I think you can just not set anything and it will work. Are you setting this because you want to change the voice? Thanks
Yes, that’s correct.
Here’s the snippet of what I am doing after runner.run().
I noticed that even though the voice is correctly defined in initial_model_settings, the greeting triggered through session.model.send_event does not use that voice. Subsequent responses do use the configured voice correctly.
Also, OpenAI Traces aren’t working either.
initial_model_settings: RealtimeSessionModelSettings = {
    "voice": voice or "alloy",
    "modalities": ["audio"],
    "turn_detection": {"type": "semantic_vad", "interrupt_response": True},
    "tracing": {
        "workflow_name": "voice_receptionist",
        "group_id": call_id,
        "metadata": {"tenant_id": tenant_id},
    },
}
async with await runner.run(
    model_config={
        "call_id": call_id,
        "initial_model_settings": initial_model_settings,
    }
) as session:
    # Get the ACTUAL iterator once
    event_iterator = session.__aiter__()
    # Get first event
    first_event = await anext(event_iterator)
    logger.info("First event: %s", first_event.type)
    # trigger an initial greeting
    # issue a response.create immediately after the Websocket attaches so the model speaks
    # before caller says anything
    await session.model.send_event(
        RealtimeModelSendRawMessage(
            message={
                "type": "response.create",
                "other_data": {
                    "response": {
                        "instructions": (
                            "Say exactly '"
                            f"{greeting}"
                            "' now before continuing the conversation."
                        ),
                    }
                },
            }
        )
    )
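For debugging, it can help to build the greeting payload as a plain dict first so it can be logged before it is wrapped in RealtimeModelSendRawMessage. The sketch below only mirrors the message shape from the snippet above; build_greeting_event is a hypothetical helper name, and quoting the greeting with json.dumps (instead of single quotes) is my own assumption so that apostrophes in the greeting do not break the instruction sentence. It does not touch the voice issue.

```python
import json
from typing import Any


def build_greeting_event(greeting: str) -> dict[str, Any]:
    """Build the raw response.create message used for the initial greeting.

    Hypothetical helper: the dict shape is copied from the snippet in this post.
    """
    # json.dumps wraps the greeting in double quotes, so an apostrophe inside
    # the greeting text cannot terminate the quoted phrase early.
    quoted = json.dumps(greeting, ensure_ascii=False)
    return {
        "type": "response.create",
        "other_data": {
            "response": {
                "instructions": (
                    f"Say exactly {quoted} now before continuing the conversation."
                ),
            }
        },
    }


if __name__ == "__main__":
    # Print the payload so it can be compared with what the server receives.
    print(json.dumps(build_greeting_event("Hi, it's Acme. How can I help?"), indent=2))
```

The returned dict could then be passed as message= in the send_event call shown above.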
I have a conditional in the event_iterator loop for event.type == "error", which logs this:
realtime-webhook-1 | ERROR:realtime-webhook:Realtime session error: RealtimeError(message="Invalid type for 'session.audio.input.format': expected an object, but got null instead.", type='invalid_request_error', code='invalid_type', event_id=None, param='session.audio.input.format')
However, audio works correctly: the greeting is said first (though not in the configured voice) and the agent responds normally. The one thing that doesn't work at all is function tool calling, even though it's set in the RealtimeAgent().