Unknown parameter: 'modalities' when creating transcriptionSessions

I am trying to create a transcription session (transcriptionSessions) with these parameters:

And I am getting this response from the API:
Unknown parameter: 'modalities'.

But according to the documentation, this parameter is supported by the API.

I am using the latest version of the Node.js SDK: 4.89.0

Your request can be rejected by the SDK's own validation, before anything is sent, if you do not have the latest version of the library with the required changes.

openai/resources/beta/realtime/transcription_sessions.py

    def create(
        self,
        *,
        include: List[str] | NotGiven = NOT_GIVEN,
        input_audio_format: Literal["pcm16", "g711_ulaw", "g711_alaw"] | NotGiven = NOT_GIVEN,
        input_audio_noise_reduction: transcription_session_create_params.InputAudioNoiseReduction
        | NotGiven = NOT_GIVEN,
        input_audio_transcription: transcription_session_create_params.InputAudioTranscription | NotGiven = NOT_GIVEN,
        modalities: List[Literal["text", "audio"]] | NotGiven = NOT_GIVEN,
        turn_detection: transcription_session_create_params.TurnDetection | NotGiven = NOT_GIVEN,
        # Use the following arguments if you need to pass additional parameters to the API that aren't available via kwargs.
        # The extra values given here take precedence over values defined on the client or passed to this method.
        extra_headers: Headers | None = None,
        extra_query: Query | None = None,
        extra_body: Body | None = None,
        timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
    ) -> TranscriptionSession:

I am using the latest version of the Node.js SDK: 4.89.0

Support for modalities seems to be there:

In the API reference, modalities is optional (though the default is not indicated). I suspect that you don’t want AI voice output for your voice input.

Yes, I want text only. But it seems the API does not support this property.
The SDK implementation is OK.

Confirming that this issue occurs with the Node.js SDK.
Executing this code:

const session = await openai.beta.realtime.transcriptionSessions.create({
  input_audio_format: "pcm16",
  input_audio_transcription: {
    model: 'gpt-4o-transcribe',
    language: 'es',
    prompt: 'My prompt'
  },
  modalities: ['text']
});

returns the following error:

Error interacting with OpenAI: Error: 400 Unknown parameter: 'modalities'.
    at async POST (app/api/openai-session/route.ts:21:20)
  19 |     // });
  20 |
> 21 |     const session = await openai.beta.realtime.transcriptionSessions.create({
     |                    ^
  22 |       input_audio_format: "pcm16",
  23 |       input_audio_transcription: {
  24 |         model: 'gpt-4o-transcribe', {
  status: 400,
  headers: [Object],
  request_id: 'req_7cb454df85ff52615feccfe753a46311',
  error: [Object],
  code: 'unknown_parameter',
  param: 'modalities',
  type: 'invalid_request_error'
}
{
  message: "Unknown parameter: 'modalities'.",
  type: 'invalid_request_error',
  param: 'modalities',
  code: 'unknown_parameter'
}
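
Since the error output above carries a request_id and a 400 status, the rejection appears to come from the API rather than from the SDK. Until the endpoint accepts modalities, one possible workaround is to retry the call without the rejected parameter. The following is only a sketch, not something confirmed by the SDK or the docs: ApiErrorLike, dropUnknownParam and createWithFallback are hypothetical names, the create argument stands in for openai.beta.realtime.transcriptionSessions.create, and the logic assumes the error object exposes the code and param fields exactly as printed above. I have not verified what the session defaults to when modalities is omitted.

```typescript
// Subset of the error fields printed in the 400 response above (assumed shape).
interface ApiErrorLike {
  status?: number;
  code?: string | null;
  param?: string | null;
}

type Params = Record<string, unknown>;

// Returns a copy of `params` without the parameter the API rejected,
// or null when the error is not an `unknown_parameter` error for a key in `params`.
function dropUnknownParam(params: Params, err: ApiErrorLike): Params | null {
  if (err.code !== "unknown_parameter" || !err.param) return null;
  if (!(err.param in params)) return null;
  const copy: Params = { ...params }; // shallow copy; the caller's object is untouched
  delete copy[err.param];
  return copy;
}

// Hypothetical wrapper: `create` is passed in by the caller, so this sketch
// does not import the SDK itself. It retries at most once.
async function createWithFallback<R>(
  create: (params: Params) => Promise<R>,
  params: Params,
): Promise<R> {
  try {
    return await create(params);
  } catch (err) {
    const retryParams =
      typeof err === "object" && err !== null
        ? dropUnknownParam(params, err as ApiErrorLike)
        : null;
    if (retryParams === null) throw err; // unrelated error: do not hide it
    return create(retryParams);
  }
}
```

To use it, you would pass a small wrapper around openai.beta.realtime.transcriptionSessions.create as create (casting the params to the SDK's type as needed) together with the same parameter object as in the failing call. Any error other than unknown_parameter is rethrown unchanged.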