Whisper - What would be the approach to transcribing multi-language audio?

I am using the Whisper API fairly successfully to transcribe audio to text and to create subtitles for videos.

However, I am facing a problem handling multi-language audio. It is understandable why it does not work well, but I wanted to ask if anyone has managed to make it work, at least to some degree? What strategies could we use to make it better?


If you know the language being spoken beforehand, you can pass that to the model and it will perform well. I have not built anything that is multilingual without being told the language beforehand.

Well, this could be good. Are you passing it in the prompt or as a parameter? I guess the former.