I have been using the Whisper API fairly successfully to transcribe audio to text and to create subtitles for videos.
However, I am having trouble with multi-language audio. It is understandable why this does not work well, but has anyone managed to make it work, at least to some degree? What strategies could we use to improve it?