Input_audio_format not correctly setting (Advanced Voice API)

specialK45 · December 19, 2024, 9:42pm

I'm calling the API to get the ephemeral key, as described here:

Doc: https://platform.openai.com/docs/api-reference/realtime-sessions/create

{
  "id": "sess_001",
  "object": "realtime.session",
  "model": "gpt-4o-realtime-preview-2024-12-17",
  "modalities": ["audio", "text"],
  "instructions": "You are a friendly assistant.",
  "voice": "alloy",
  "input_audio_format": "pcm16",
  "output_audio_format": "pcm16",
  "input_audio_transcription": {
    "model": "whisper-1"
  },
  "turn_detection": null,
  "tools": [],
  "tool_choice": "none",
  "temperature": 0.7,
  "max_response_output_tokens": 200,
  "client_secret": {
    "value": "ek_abc123", 
    "expires_at": 1234567890
  }
}

But when I get the 200 status response, this is the object I receive:
Full OpenAI Response: {
  "id": "sess_AgIP6X5skqhY9NskTp4hw",
  "object": "realtime.session",
  "model": "gpt-4o-realtime-preview-2024-12-17",
  "expires_at": 0,
  "modalities": [
    "text",
    "audio"
  ],
  "instructions": "You are a friendly assistant. ",
  "voice": "alloy",
  "turn_detection": {
    "type": "server_vad",
    "threshold": 0.5,
    "prefix_padding_ms": 300,
    "silence_duration_ms": 200,
    "create_response": true
  },
  "input_audio_format": "pcm16",
  "output_audio_format": "pcm16",
  "input_audio_transcription": null,
  "tool_choice": "auto",
  "temperature": 1.1,
  "max_response_output_tokens": "inf",
  "client_secret": {
    "value": "ek_676490ac32c4819081879947063c028d",
    "expires_at": 1734643944
  },
  "tools":
}

input_audio_transcription comes back as null, even though I set it the way the doc says to when requesting the ephemeral key.
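To make the mismatch easier to see, here is a minimal Python sketch that diffs the requested settings against the returned session. It assumes the first object above reflects the values that were requested. The helper name `diff_session` is hypothetical, and only a subset of fields is copied in.

```python
def diff_session(requested: dict, returned: dict) -> dict:
    """Return {field: (requested_value, returned_value)} for each requested
    field whose value differs in the returned session object."""
    mismatches = {}
    for key, want in requested.items():
        got = returned.get(key)
        if got != want:
            mismatches[key] = (want, got)
    return mismatches


# Values copied from the two objects in this post (assumption: the first
# object is what was sent, the second is what came back).
requested = {
    "input_audio_format": "pcm16",
    "input_audio_transcription": {"model": "whisper-1"},
    "turn_detection": None,
    "tool_choice": "none",
    "temperature": 0.7,
    "max_response_output_tokens": 200,
}
returned = {
    "input_audio_format": "pcm16",
    "input_audio_transcription": None,
    "turn_detection": {
        "type": "server_vad",
        "threshold": 0.5,
        "prefix_padding_ms": 300,
        "silence_duration_ms": 200,
        "create_response": True,
    },
    "tool_choice": "auto",
    "temperature": 1.1,
    "max_response_output_tokens": "inf",
}

mismatches = diff_session(requested, returned)
for field, (want, got) in mismatches.items():
    print(f"{field}: requested {want!r}, got {got!r}")
```

With these values, input_audio_format matches, but turn_detection, tool_choice, temperature, and max_response_output_tokens differ along with input_audio_transcription, so the transcription setting is not the only field that comes back different.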

Has anyone else had this problem? The response from the ephemeral auth request keeps input_audio_transcription as null no matter what I send, so I can't get any of the user-generated transcriptions.
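For anyone trying to reproduce this, one thing worth ruling out is whether the transcription setting actually reaches the API as a JSON body. The sketch below builds the request without sending it, so the payload can be inspected first. The endpoint URL is an assumption based on the doc page linked above, the function name `build_session_request` is hypothetical, and the API key is a placeholder.

```python
import json
import urllib.request

# Assumed endpoint for the "create session" call in the linked doc page.
SESSIONS_URL = "https://api.openai.com/v1/realtime/sessions"


def build_session_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST that requests an ephemeral key."""
    body = {
        "model": "gpt-4o-realtime-preview-2024-12-17",
        "modalities": ["audio", "text"],
        "instructions": "You are a friendly assistant.",
        "voice": "alloy",
        "input_audio_format": "pcm16",
        "output_audio_format": "pcm16",
        # The setting that comes back as null in the response above.
        "input_audio_transcription": {"model": "whisper-1"},
    }
    return urllib.request.Request(
        SESSIONS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder key: nothing is sent over the network here.
req = build_session_request("sk-placeholder")
sent_body = json.loads(req.data)
print(req.get_method(), req.full_url)
print(sent_body["input_audio_transcription"])
```

To actually send it, the request can be passed to `urllib.request.urlopen(req)` with a real key. If the body printed here already contains input_audio_transcription and the response still returns null, the problem is on the API side and not in the request.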
