Issues with Maithili Language Output Using Whisper Multilanguage Tokenizer
Hello everyone,
I am currently working on a project involving speech recognition for the Maithili language, which is not natively supported by existing models. I am using OpenAI’s Whisper multilanguage tokenizer for this purpose. However, I am encountering an issue where the output generated by the model is not in Maithili but rather in other languages.
Details:
Model: Whisper (multilanguage tokenizer)
Language: Maithili (a language not previously trained on)
Issue: The output printed by the model is not in Maithili, but in other languages.
Could anyone suggest possible reasons for this issue and potential solutions to ensure that the model generates accurate output in Maithili? Any insights or recommendations on how to address this problem would be greatly appreciated.
The tag you want to add is fine-tuning, because that is the only way this is going to work for an unsupported language with fewer native speakers than the population of California. There is no secret parameter that makes it start working, and the tokenizer does not accept ISO language codes outside the pretrained set.
If the input yields a coherent transcription in another language rather than pure hallucination, that would be thought-provoking: it would suggest some small amount of Maithili data may already be present in the training set behind OpenAI's Whisper (large-v2). But there is little way to activate it, short of sending 30 seconds of audio in which the first 10 s are eloquently spoken, unmistakably clear native language, along with a long initial prompt written out in that language.
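That prompting trick can be sketched as a thin wrapper around `model.transcribe` from the open-source `openai-whisper` package. The choice of Hindi (`hi`) as the nearest pretrained relative and the prompt text are assumptions for illustration, not a known-good recipe:

```python
def transcribe_unsupported(model, audio_path: str) -> dict:
    """Nudge a loaded Whisper model toward Maithili-adjacent output.

    `model` is expected to come from whisper.load_model(). Forcing a
    related supported language plus a prompt in the target script is
    the only lever available short of fine-tuning.
    """
    return model.transcribe(
        audio_path,
        language="hi",            # closest pretrained relative (assumption)
        initial_prompt="मैथिली",   # seed the decoder with target-script text (assumption)
        condition_on_previous_text=True,  # let earlier text keep steering decoding
    )
```

Usage would look like `result = transcribe_unsupported(whisper.load_model("large-v2"), "clip.wav")`; expect Hindi-flavored output at best.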
Fine-tuning is a problem at a scale where you would want to enlist well-funded partners interested in the overall project. The knowledge work required to produce datasets where there are none to be mined can be extensive. You must consider that hundreds or thousands of hours of labeled audio, split into 30-second snippets, are required for the tuning to be of quality in a new-language case.
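The scale is easy to underestimate. A back-of-the-envelope conversion from hours of labeled speech to 30-second training snippets (the hour counts are illustrative, not a benchmark):

```python
SNIPPET_SECONDS = 30  # Whisper operates on 30-second audio windows

def snippets_needed(hours: float) -> int:
    """Number of labeled 30 s clips implied by a given count of audio hours."""
    return int(hours * 3600 / SNIPPET_SECONDS)

print(snippets_needed(500))    # 500 hours  -> 60000 labeled snippets
print(snippets_needed(2000))   # 2000 hours -> 240000 labeled snippets
```

Every one of those snippets needs an accurate Maithili transcript, which is where the real cost lives.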
The OpenAI API doesn't support fine-tuning Whisper, but the model is open source, so the fine-tune can be done with your own tooling.
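One common open-source route is Hugging Face `transformers`, which exposes Whisper for sequence-to-sequence fine-tuning. This is only a sketch of the setup's shape: the checkpoint name, output directory, and hyperparameters are placeholder assumptions, and dataset preparation plus the data collator are left entirely to the caller.

```python
from transformers import (
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    WhisperForConditionalGeneration,
    WhisperProcessor,
)

def build_trainer(train_ds, eval_ds, data_collator):
    """Assemble a Whisper fine-tuning trainer (sketch, not a tested recipe)."""
    processor = WhisperProcessor.from_pretrained("openai/whisper-small")
    model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
    # No Maithili language token exists, so don't force a language token
    model.config.forced_decoder_ids = None
    args = Seq2SeqTrainingArguments(
        output_dir="./whisper-maithili",  # hypothetical path
        per_device_train_batch_size=8,
        learning_rate=1e-5,
        max_steps=4000,
        fp16=True,
    )
    return Seq2SeqTrainer(
        model=model,
        args=args,
        train_dataset=train_ds,
        eval_dataset=eval_ds,
        data_collator=data_collator,
        tokenizer=processor,
    )
```

Calling `.train()` on the result would run the fine-tune; in practice you would follow a full recipe for feature extraction and transcript tokenization first.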