Hey!
I’m using Whisper via Azure and it returns a confidence value. Can anyone share how that score is computed?
Thanks
Not sure, but my guess is that the text is generated token by token: at each step the model outputs a probability distribution over possible next tokens, based on the audio input and the tokens produced so far. The confidence value might be the probability of the chosen token, or a softmax over the top-k candidate scores.
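To make the guess above concrete, here is a minimal sketch of one way a confidence score could be derived from token probabilities. This is an assumption for illustration, not Azure's documented method: the function names `softmax` and `sequence_confidence` and the toy logits are hypothetical. It turns each step's raw scores into probabilities, takes the log-probability of the chosen token, averages across the sequence, and exponentiates to get a value between 0 and 1 (the geometric mean of the token probabilities).

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sequence_confidence(step_logits, chosen_ids):
    """Hypothetical confidence: exp of the average token log-probability.

    step_logits: one list of raw scores per decoding step
    chosen_ids:  index of the token actually emitted at each step
    """
    logprobs = []
    for logits, token_id in zip(step_logits, chosen_ids):
        probs = softmax(logits)
        logprobs.append(math.log(probs[token_id]))
    avg_logprob = sum(logprobs) / len(logprobs)
    return math.exp(avg_logprob)


# Toy example: two decoding steps over a three-token vocabulary.
step_logits = [[4.0, 1.0, 0.5], [0.2, 3.0, 0.1]]
chosen_ids = [0, 1]
print(sequence_confidence(step_logits, chosen_ids))
```

For comparison, the open-source Whisper package reports a per-segment `avg_logprob`, which is the same kind of averaged log-probability. Whether Azure's confidence value is derived from it is not something I can confirm.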
Thanks for answering, @karkiabinash777! That seems like a very good guess.
(Side question: Is there a better forum to pose Whisper-related questions? In this forum there doesn’t seem to be much interest in the Whisper model – that’s my experience, at least.)
This is still the best place to ask questions about any model made by OpenAI, Whisper included.
You can find other conversations about Whisper using the search function or by clicking the whisper tag.