Hi,
After Websocket initialization i update the session and i have this response:
{
"type": "session.updated",
"event_id": "xxx",
"session": {
"id": "xxx",
"object": "realtime.session",
"model": "gpt-4o-realtime-preview-2024-10-01",
"expires_at": 1728374700,
"modalities": [
"text",
"audio"
],
"instructions": "...",
"voice": "shimmer",
"turn_detection": {
"type": "server_vad",
"threshold": 0.5,
"prefix_padding_ms": 300,
"silence_duration_ms": 500
},
"input_audio_format": "pcm16",
"output_audio_format": "pcm16",
"input_audio_transcription": {
"model": "whisper-1"
},
"tool_choice": "auto",
"temperature": 0.8,
"max_response_output_tokens": "inf",
"tools": []
}
}
and then i send my audio:
{
'type': 'conversation.item.create',
'item': {
'type': 'message',
'role': 'user',
'content': [
{
'type': 'input_audio',
'audio': audio_64
}
]
}
}
in the response i have everything except
conversation.item.input_audio_transcription
Can someone please help?