Languages in Realtime API

I implemented the Realtime API with the help of the OpenAI realtime console repo, but sometimes when I speak English, the transcription comes back in another language, such as Malay. When I translate that transcription, it matches what I said in English.

Has anyone run into this before? And does anyone know how to restrict recognition to a specific language (only English, only Malay, etc.)? Thanks

I have a similar problem with other languages. When I speak Spanish to the Realtime API, it always replies in Spanish, but 20-30% of the time it transcribes my sentences into completely unrelated languages.

Adding instructions to the system prompt seems to have zero effect on this.

Hello everyone, I’m also encountering a problem of this nature but a bit different. It feels like, from time to time, I receive transcriptions from other people, as if things are getting mixed up on OpenAI’s end (Simple assumption…).
I say something in French and receive a transcription in another language (English or otherwise) that has nothing to do with my sentence. And twice, with two different sentences in French, I received the same transcription (in French this time) but completely irrelevant, which was: “Sous-titrage Société Radio-Canada.” I’m a bit puzzled…

This is an issue that will probably be fixed in the future.
I use German and I get most of my transcriptions (somewhat) right, but it’s not very reliable.

The request to Whisper is separate from the Realtime API request, so the AI’s answer is not based on the transcription.

This means you could work around the problem by sending the audio to the Whisper endpoint yourself, since you probably don’t need the transcription in real time.

The Whisper endpoint lets you set more parameters, such as the language.
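
For reference, here is a minimal sketch of that workaround in Python, assuming the official `openai` package and the `whisper-1` model. The helper name `transcribe_with_language` and the file name `utterance.wav` are made up for illustration, and the `language` value is an ISO-639-1 code such as `en` or `ms`. Treat it as a starting point rather than a tested integration.

```python
from typing import Any, BinaryIO


def transcribe_with_language(client: Any, audio_file: BinaryIO, language: str = "en") -> str:
    """Send one recorded utterance to the transcription endpoint with the language pinned.

    `client` is expected to be an `openai.OpenAI` instance (assumption); it is passed in
    so the helper can also be exercised with a stand-in object.
    """
    # Reject values that are not two-letter codes before spending a request.
    if len(language) != 2 or not language.isalpha():
        raise ValueError("expected a two-letter ISO-639-1 code such as 'en' or 'ms'")

    result = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        language=language.lower(),  # pins the input language instead of auto-detecting it
    )
    return result.text


if __name__ == "__main__":
    # Requires `pip install openai` and OPENAI_API_KEY in the environment.
    from openai import OpenAI

    with open("utterance.wav", "rb") as f:  # hypothetical recording of one user turn
        print(transcribe_with_language(OpenAI(), f, language="en"))
```

You would still have to capture the user's audio on your side and send it separately, so the transcript arrives a bit later than the realtime response.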

This is more of a workaround, not a fix, but it might suit you for now while we are all still in beta.

Best of luck! :hugs:

Hopefully we’ll get a language parameter soon.

The transcriptions are completely unreliable right now. I think this is a Whisper issue; adding a language parameter would certainly help.

I don’t think it is a Whisper issue. It works perfectly fine detecting languages standalone. I think the issue is specifically within the Realtime API.