Bidirectional realtime translation translates a single utterance into multiple languages

TL;DR: I am trying to use bidirectional realtime translation in speakerphone mode, and it’s translating a single input voice into the target languages of both streams, resulting in overlapping voices from the model in different languages.

I am calling the gpt-realtime-translate API from an iOS WebRTC app, and when I speak in English with realtime translation to German enabled, it translates my speech into both English and German, which you can see in the transcript below. Note that this is with a single speaker and in speakerphone mode.


I can’t upload a video here, but here’s a link to the GH issue with a video that should make the issue clear: Bidirectional Realtime Translation · Issue #133 · tleyden/arty · GitHub

Some hypotheses on why it doesn’t work:

  1. Since I have two streams open and the mic audio is being sent on both, the model is translating on both: one stream English → German, and the other English → English (which I find surprising and counter-intuitive).

  2. This is an unsupported configuration: the API expects completely isolated streams, and bidirectional translation doesn’t work when both streams are connected to the same mic and speaker.

I think the issue is number two. The docs say: “one translation session per direction, do not mix both callers”, but I have no choice but to mix the audio streams. I am trying to build an “in real life” translator that can translate in both directions between two different speakers, and there are no separate callers with their own mics and speakers as you would have in an online meeting: I put my phone on the table and have a two-way conversation with another person in two different languages.
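One workaround I’m considering is a half-duplex router on the app side: run per-utterance language identification on the shared mic feed and forward each utterance only to the session whose source language matches, so neither session ever hears the other direction’s input. A minimal sketch (all names here are hypothetical, not part of the gpt-realtime-translate API):

```python
# Half-duplex routing sketch for a single shared mic feeding two
# translation sessions (EN -> DE and DE -> EN). The session dicts and
# route_utterance() are hypothetical app-side structures, not API objects.

def route_utterance(utterance_lang, sessions):
    """Forward an utterance only to the session expecting that source language."""
    return [s for s in sessions if s["source_lang"] == utterance_lang]

sessions = [
    {"name": "en_to_de", "source_lang": "en", "target_lang": "de"},
    {"name": "de_to_en", "source_lang": "de", "target_lang": "en"},
]

# An English utterance goes only to the EN -> DE session, so the DE -> EN
# session never sees (and never re-translates) the English audio.
targets = route_utterance("en", sessions)
```

The hard part, of course, is the language-identification step itself, which would have to run on-device before any audio is forwarded.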

If speakerphone mode is supported for bidirectional translation, is there a way to tell the model which language it should expect to hear on the input stream? Or any other suggested approaches?
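In case it helps frame the question: the standard Realtime API accepts a `session.update` event with free-form `instructions` over the WebRTC data channel. I don’t know whether the translation model honors language constraints expressed this way (that’s exactly what I’m asking), but something like the following is what I have in mind:

```python
import json

# Hedged sketch: builds a standard Realtime API `session.update` event.
# Whether gpt-realtime-translate respects a language constraint in
# `instructions` is an open question, not a documented behavior.
def make_session_update(source_lang, target_lang):
    return json.dumps({
        "type": "session.update",
        "session": {
            "instructions": (
                f"You will only hear {source_lang} speech on this stream. "
                f"Translate {source_lang} input into {target_lang}. "
                f"If you hear any other language, stay silent."
            )
        },
    })

# One such event per direction, sent on each session's data channel.
event = make_session_update("English", "German")
```

If the model ignored input in the “wrong” language per these instructions, the overlapping-voices problem would go away even with a shared mic.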
