Whisper API for pronunciation, intonation, etc

raivat · November 16, 2023, 7:28am

I’m exploring the use of ASR

Mainly, I want to find out whether Whisper can be used to measure or recognise things like correct pronunciation, intonation, and articulation, which are often lost in other speech-to-text services. From the outset, and from reading the documentation, it seems unlikely, but I wanted to ask here in case anyone has thought of or tried something similar.

Thanks!

Foxalabs · November 16, 2023, 7:39am

Whisper 2 does not have any kind of accent or pronunciation detection; it simply tries to guess which word the speaker was attempting to say.

Whisper 3 may have some additional abilities, but I have not seen any details on that yet.
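
To illustrate the limitation: since the output is only the guessed words, about the most a transcript alone allows is a comparison against the sentence the speaker was asked to read. Below is a minimal sketch of that idea, assuming you already have the transcript as a plain string. The function name and sample sentences are hypothetical, and this only flags word-level mismatches; it is a crude proxy and says nothing about intonation or articulation.

```python
import difflib
import re


def mismatched_words(expected: str, transcript: str) -> list[tuple[str, str]]:
    """Return (expected, heard) pairs where the transcript differs from the prompt.

    Hypothetical helper: a word-level diff, not a pronunciation score.
    """
    # Normalise both texts: lowercase, keep only letters and apostrophes.
    exp = re.findall(r"[a-z']+", expected.lower())
    got = re.findall(r"[a-z']+", transcript.lower())

    # Align the two word sequences and collect every non-matching span.
    matcher = difflib.SequenceMatcher(a=exp, b=got, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((" ".join(exp[i1:i2]), " ".join(got[j1:j2])))
    return diffs


# The speaker was asked to read the first sentence; the second is what
# the speech-to-text step returned (made-up example strings).
print(mismatched_words("The ship is leaving", "the sheep is leaving"))
# [('ship', 'sheep')]
```

A mismatch like this might hint at a vowel problem, but it could just as easily be a recognition error, so treat it as a prompt for a human to listen rather than as a measurement.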

raivat · November 18, 2023, 4:54am

Thanks so much for the info! @Foxalabs

sanjeev.katariya · February 25, 2024, 8:56pm

I also hope that “mixed” language is picked up. In many places people mix words from different languages, and those words have their own pronunciation. I also realize that the same language is spoken differently from culture to culture, so it’s complex. It would need all the training data for that (few-shot examples can help, and they don’t have to be extensive, even on phoneme structures).
