Hi Fabrizio, I think our error is that we aren’t nesting the configuration inside the session object.
If you look at the documentation, the transcription_session.update event is as follows:
{
  "type": "transcription_session.update",
  "session": {
    "input_audio_format": "pcm16",
    "input_audio_transcription": {
      "model": "gpt-4o-transcribe",
      "prompt": "",
      "language": ""
    },
    "turn_detection": {
      "type": "server_vad",
      "threshold": 0.5,
      "prefix_padding_ms": 300,
      "silence_duration_ms": 500,
      "create_response": true
    },
    "input_audio_noise_reduction": {
      "type": "near_field"
    },
    "include": [
      "item.input_audio_transcription.logprobs"
    ]
  }
}
Note that all the configuration options are within the session object. Also, we don’t have to send the session key in every request.
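In case it helps, here is a minimal Python sketch of that nesting. The helper name build_transcription_session_update and the flat_config dict are my own for illustration, not part of the API or any SDK. It only builds the JSON string; sending it over your WebSocket connection is left out.

```python
import json


def build_transcription_session_update(config: dict) -> str:
    """Wrap flat configuration options in the "session" object and serialize.

    Hypothetical helper: the name and signature are assumptions for this example.
    """
    event = {
        "type": "transcription_session.update",
        # Every configuration option goes inside "session", not at the top level.
        "session": dict(config),
    }
    return json.dumps(event)


# Example options, taken from the documented payload above.
flat_config = {
    "input_audio_format": "pcm16",
    "input_audio_transcription": {"model": "gpt-4o-transcribe"},
    "turn_detection": {"type": "server_vad", "threshold": 0.5},
}

message = build_transcription_session_update(flat_config)
print(message)
```

Serializing with json.dumps also guarantees valid JSON, so stray trailing commas from a hand-written payload can't sneak in.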
These changes fixed the same error for me. Audio is now being sent to the API, but the API stays completely silent apart from the speech start event. That’s likely unrelated to this, though.
Cheers!