Realtime transcription model changes

Bruno_4 · May 7, 2025, 10:22am

Hi all,

I originally created some javascript code for realtime transcription using websockets, following this doc:
https://platform.openai.com/docs/guides/realtime-transcription

With the relevant code looking pretty straightforward:

...


const OPENAI_REALTIME_MODEL = 'whisper-1';
const OPENAI_REALTIME_URL = `wss://api.openai.com/v1/realtime?model=${OPENAI_REALTIME_MODEL}`;

export async function runTranscription({ connection, customParameters }) {

  return new Promise((resolve, reject) => {
    let sessionIsReady = false;
    let audioBufferQueue = [];
    const most_likely_names = customParameters.names || ''; 
    const language = customParameters.language || 'nl';

    const openAiWs = new WebSocket(OPENAI_REALTIME_URL, {
      headers: OPENAI_HEADERS
    });

    const sendSessionUpdate = () => {
      const sessionUpdate = {
        type: 'session.update',
        session: {
          input_audio_format: 'g711_ulaw',
          input_audio_transcription: {
            model: OPENAI_REALTIME_MODEL,
            language: language,
            prompt: most_likely_names // tokens that will have higher likelihood of being recognized
          },
          turn_detection: {
            type: 'server_vad',
            threshold: 0.5,
            prefix_padding_ms: 300, // Adds x milliseconds of audio before the detected speech turn.
            silence_duration_ms: 1200 // after x milliseconds of silence the turn of the caller ends.
          },
          input_audio_noise_reduction: {
            type: 'far_field'
          },
          include: ['item.input_audio_transcription.logprobs']
        }
      };
      openAiWs.send(JSON.stringify(sessionUpdate));
    };

...

As you all can see I used whisper, this worked up until at least the 30th of april. Without any changes in code or environment, on the 6th of may I used it again, resulting in the new error: “Model “whisper-1” is not supported in realtime mode”, so it seems OpenAI changed the supported realtime models.

This is correctly changed here: https://platform.openai.com/docs/models/whisper-1, where the realtime endpoint is indeed not available for whisper. So, I thought I’d simply change to the, still realtime-supported, models: gtp-4o-transcribe and the mini version. But this is not actually the case anymore, while the models are shown to have the v1/realtime endpoint, I receive the same message as with the whisper model: “Error: OpenAI error: Model “gpt-4o-transcribe” is not supported in realtime mode.”.

I can’t really find any info on these updates, I’m wondering if anyone knows more about this? Furthermore, anyone know a good alternative for realtime transcription now?

Many thanks,
Bruno

mbarton · May 27, 2025, 3:20pm

I am running into the same issue.
I’ve found that if I set wss://api.openai.com/v1/realtime?model=gpt-4o-realtime and then model: 'gpt-4o-transcribe' I am able to get it to work.

The docs for this functionality are really lacking.

mbarton · May 27, 2025, 7:16pm

I was able to get this to work properly by first initializing the transcription session by making a POST request to

https://api.openai.com/v1/realtime/transcription_sessions

{
        "input_audio_format": "g711_ulaw",
        "input_audio_transcription": {
            "model": "gpt-4o-transcribe",
            "language": "en",
            "prompt": ""
        },
        "turn_detection": {
            "type": "semantic_vad"
        },
        "include": [
            "item.input_audio_transcription.logprobs"
        ]
    })

and then using the client_secret value returned in the request to create the websocket

    const openAiWs = new WebSocket(wss://api.openai.com/v1/realtime, {
        headers: {
            'authorization': `Bearer ${clientSecret}`,
            'OpenAI-Beta': 'realtime=v1'
        }
    });

Topic		Replies	Views
Implementing gpt-realtime and gpt4-4o-transcribe for a streaming transcription API streaming , transcribe , gpt-realtime	9	1243	September 15, 2025
WebRTC transcription guide seems to be broken Bugs	12	1007	April 1, 2025
Transcription config for `gpt-4o-mini-transcribe` doesn't work? Bugs	4	822	March 21, 2025
Input_audio_transcription server events stopped occuring API	3	198	September 29, 2025
Realtime streaming transcription API api-realtime	2	154	January 8, 2026

Realtime transcription model changes

Related topics