I implemented the Realtime API with the help of the OpenAI realtime console repo, but sometimes when I am speaking English, the transcription printed out is in another language, like Malay, and when I translate that transcription it means the same thing I said in English.
Has anyone faced this before? And does anyone know how to set a specific recognized language (only English, only Malay, etc.)? Thanks
I have a similar problem, just in other languages. When I try speaking Spanish with realtime, it always replies in Spanish, but 20-30% of the time it transcribes my sentences into completely unrelated languages.
Giving prompts to the system seems to have no effect on this at all.
This is an issue that will probably be fixed in the future.
I use German and I get most of my transcriptions (somewhat) right, but it’s not very reliable.
The request to Whisper is separate from the Realtime API request, meaning the AI's answer does not depend on the transcription.
This means you could work around the problem by calling the Whisper endpoint yourself, since you probably don't need the transcription in real time.
The Whisper endpoint also lets you set more parameters, like the language.
This is more of a workaround, not a fix, but it might suit you for now while we are all still in beta.
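To make the workaround concrete, here is a minimal sketch of posting recorded audio to the Whisper transcriptions endpoint with the language pinned. The endpoint URL and the model, file, and language form fields are from the OpenAI API reference (language takes an ISO-639-1 code like "en" or "ms"); the helper names, the octet-stream content type, and reading the key from OPENAI_API_KEY are my own assumptions, not anything official:

```python
import json
import os
import urllib.request
import uuid

# Endpoint per the OpenAI API reference; everything else here is a sketch.
API_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_form_fields(language: str) -> dict:
    """Form fields for the transcriptions endpoint. `language` is an
    ISO-639-1 code ("en", "ms", "de", ...); setting it pins Whisper to
    that language instead of letting it auto-detect."""
    return {"model": "whisper-1", "language": language}


def transcribe(audio_path: str, language: str = "en") -> str:
    """POST an audio file as multipart/form-data and return the text."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in build_form_fields(language).items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode()
        )
    with open(audio_path, "rb") as f:
        audio = f.read()
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; '
         f'filename="{os.path.basename(audio_path)}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode()
        + audio + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    req = urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["text"]
```

You would buffer the user's audio from the realtime session, then call transcribe("turn.wav", language="en") after the turn ends, ignoring the transcription the Realtime API itself emits.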
I don't think it is a Whisper issue; standalone, Whisper detects languages perfectly well. I think there is an issue specifically within the Realtime API.