User transcription in Realtime in Playground is incorrect, although the model understood it

When testing the recent Realtime model in the Playground, the text transcribed from my voice input is incorrect, although gpt-4o still understands what I said.

Why? Do they run a separate speech-to-text step in the background on the user's voice input, to transcribe it and show it in the message history?

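Yes, exactly. Transcription is a separate service, decoupled from the actual voice-to-voice model. The model works on the audio directly, so the transcript shown in the message history can be wrong even when the model understood you correctly.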
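
To illustrate the decoupling, here is a minimal sketch of how a client might turn on input transcription and read the result. It assumes the Realtime API's `session.update` event with an `input_audio_transcription` setting, and the `conversation.item.input_audio_transcription.completed` server event, as I understand them from the docs; the helper names `build_session_update` and `extract_user_transcript` are made up for this example, and the event shapes may have changed since.

```python
import json


def build_session_update(transcription_model="whisper-1"):
    """Build a session.update message that enables input transcription.

    Assumption: transcription is an optional, separately configured model
    (here "whisper-1"), distinct from the voice-to-voice model itself.
    """
    return json.dumps({
        "type": "session.update",
        "session": {
            "input_audio_transcription": {"model": transcription_model},
        },
    })


def extract_user_transcript(raw_event):
    """Return the user transcript from a server event, or None.

    Only the transcription-completed event carries the user's transcript;
    every other event type is ignored here.
    """
    event = json.loads(raw_event)
    if event.get("type") == "conversation.item.input_audio_transcription.completed":
        return event.get("transcript")
    return None
```

The point of the sketch is that the transcript arrives as its own event from its own model, so treat it as a rough guide to what was said, not as the exact input the voice model reasoned over.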
