Hello OpenAI, I am using the OpenAI Realtime API for audio-to-audio translation. In my prompt (instructions), I specified that any background music should be kept along with the speaker's voice, but the output contains only the plain voice, with no background sound even when the input has it.
Also, when I specify that it should translate the audio while automatically choosing a voice (man, woman, child, teen, etc.), it doesn't work like that. It just selects one voice and translates from start to end with that voice.
Please, how can I work around these issues? I'm using Node.js.
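
To illustrate the kind of setup described above, here is a minimal sketch under assumptions, not the actual code in use. It assumes the session is configured with a `session.update` event carrying `instructions` and `voice` fields. The helper name `buildSessionUpdate`, the prompt wording, and the voice value `alloy` are placeholders chosen for illustration only.

```javascript
// Hypothetical helper: builds a Realtime API `session.update` client event.
// The prompt text and voice name passed in below are placeholders.
function buildSessionUpdate(instructions, voice) {
  return {
    type: "session.update",
    session: {
      instructions, // the translation prompt, including the background-music request
      voice,        // the session config takes one voice name, not one per speaker
    },
  };
}

const event = buildSessionUpdate(
  "Translate the speaker's audio. If there is background music, keep it along " +
    "with the speaker. Automatically choose a voice (man, woman, child, teen).",
  "alloy"
);

// In a real client this JSON string would be sent over the Realtime connection.
console.log(JSON.stringify(event, null, 2));
```

Note that in this sketch `voice` is a single value for the whole session, while the request to pick a voice per speaker and to keep background music lives only in the prompt text.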