Thoughts on Whisper-3 announcement

nikola1jankovic · November 6, 2023, 8:42pm

OK, the Whisper-3 announcement was one of the biggest things for me, and a surprising one as well. However, they were very brief about it, which suggests it is not one of their focus products.

Still, it is open source and already released on GitHub, and I understand that API access will follow in the coming weeks (or months)?

Besides the improved language capabilities and more precise transcription, have any of the other issues been solved? For example: occasional hallucinations on music or silence, more precise timecodes for subtitles, speaker recognition, word-level timecodes, etc. From what I can see on GitHub, no, so I am guessing OpenAI will either depend on others to integrate alignment and diarization, or is waiting for the API release to offer these tools?

If anyone has tried it, I would be interested to hear your thoughts.

wbekker · November 6, 2023, 8:46pm

For real-world call transcription, diarization is a must IMHO. I hope this is on the near-term roadmap.

tsl8r · November 6, 2023, 11:07pm

There are a number of Python packages that use open-source Whisper with different vocal-isolation techniques to reduce hallucinations and add those timecodes. We have one hosted on Replicate that I can share if it helps.
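
To give a rough idea of what post-processing against hallucinations can look like, here is a minimal sketch. It is not vocal isolation and not what any particular package does; it is a simpler, swapped-in heuristic that drops segments the model itself flags as probably silent. It assumes each segment is a dict with `no_speech_prob` and `avg_logprob` keys, as in the open-source Whisper transcription output. The function name and the threshold values are only illustrative.

```python
def drop_likely_hallucinations(segments, max_no_speech_prob=0.6, min_avg_logprob=-1.0):
    """Keep segments that look like real speech.

    A segment is dropped only when the model both suspects silence
    (high no_speech_prob) and is unsure of its own text (low avg_logprob).
    Both thresholds are illustrative and should be tuned on your own audio.
    """
    kept = []
    for seg in segments:
        silent = seg.get("no_speech_prob", 0.0) > max_no_speech_prob
        unsure = seg.get("avg_logprob", 0.0) < min_avg_logprob
        if not (silent and unsure):
            kept.append(seg)
    return kept


# Hypothetical segments, shaped like Whisper's output but with made-up values.
segments = [
    {"text": " Hello there.", "no_speech_prob": 0.02, "avg_logprob": -0.21},
    {"text": " Thanks for watching!", "no_speech_prob": 0.91, "avg_logprob": -1.40},
]
for seg in drop_likely_hallucinations(segments):
    print(seg["text"].strip())
```

Requiring both conditions is a deliberately cautious choice: quiet but genuine speech can score a high `no_speech_prob` while still being transcribed confidently, and it should survive the filter.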

kaiser · November 7, 2023, 6:09am

Please share this information, I am looking forward to it. I appreciate your help.

srijanjain1207 · November 7, 2023, 7:06am

Please share some potential solutions.

tsl8r · November 7, 2023, 8:25pm

This is the best one:
Look up “stable-ts whisper”
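
For anyone wondering what you do with word-level timecodes once you have them, here is a minimal sketch that groups words into SRT subtitle cues. It does not call stable-ts or Whisper at all. It assumes you already have a list of dicts with `word`, `start` and `end` keys (times in seconds), and that each word carries its own leading space, as Whisper-style output usually does. The function names and the `max_words` parameter are hypothetical.

```python
def srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group timed words into numbered SRT cues of at most max_words words."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        # Words are assumed to carry their own leading space, so join with "".
        text = "".join(w["word"] for w in chunk).strip()
        start = srt_time(chunk[0]["start"])
        end = srt_time(chunk[-1]["end"])
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


# Hypothetical word timings, for illustration only.
words = [
    {"word": " Hello", "start": 0.0, "end": 0.4},
    {"word": " world", "start": 0.5, "end": 1.0},
]
print(words_to_srt(words))
```

Splitting on a fixed word count is the crudest possible rule; real subtitle tools also break on pauses and punctuation, which is where more precise timecodes start to matter.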
