Misinterpretation of non-speech sounds as speech (post-update bug)

Hello OpenAI Community,

I’m reaching out to discuss a recent issue I’ve encountered with the voice recognition feature during voice chats. Previously, the recognition was quite adept at detecting when I began speaking. However, since the latest update, the chat seems to misinterpret non-verbal sounds such as coughs or sneezes as words and proceeds to transcribe them inaccurately. For instance, a simple cough was transcribed as “Thanks for watching”.

Is anyone else experiencing similar issues, possibly related to the new version of Whisper? I would appreciate any insights or solutions to ensure accurate voice recognition.

Thank you for your assistance!


Same here, “Thanks for watching” :joy:.
Besides that, I really want to try Whisper v3 via the API. I’m not sure if it is already online; the guide says there is only one model available, “whisper-1”.
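For what it’s worth, here is a minimal sketch of how you can check which Whisper model IDs your account currently exposes and run a transcription through the API. It assumes the openai Python SDK and an OPENAI_API_KEY in the environment; the file name is just a placeholder, and the model list you get back may differ from what the guide shows.

```python
import os
from openai import OpenAI  # pip install openai

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# List model IDs visible to this account and look for anything Whisper-related.
# If only "whisper-1" shows up, a v3 model is not exposed under a separate ID yet.
whisper_models = [m.id for m in client.models.list() if "whisper" in m.id]
print("Available Whisper models:", whisper_models)

# Transcribe a short clip with the documented model ID.
# "sample.m4a" is a placeholder path for illustration.
with open("sample.m4a", "rb") as audio_file:
    result = client.audio.transcriptions.create(
        model="whisper-1",  # the only transcription model the guide lists
        file=audio_file,
    )
print(result.text)
```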

The funny thing is that before the update, it worked pretty well. Not perfect, but better than now.

It seems that the voice recognition system now reacts to every sound, every noise made in the surroundings of the device. When the sound stops, ChatGPT instantly switches to the “transcript phase” and does its best to interpret the recording as speech.

I hope that OpenAI fixes it sooner rather than later.

Yes. This was happening often enough that I had to provide an instruction that if the transcript came through as “Thanks for watching”, it should ignore it and let me know something went wrong. One time, after a random noise, it spat out “Thank you for watching. If you have any questions or comments, please post them in the comments section.” I’ve also had it insert something about transcription services being provided by an LLC rather than what I said.
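If you are hitting this through the transcription API rather than the ChatGPT voice UI, one workaround in the same spirit is to drop segments that look like noise-triggered hallucinations. This is only a sketch, not an official fix: it assumes response_format="verbose_json" (which returns per-segment no_speech_prob and avg_logprob), a hand-maintained list of the stock phrases people report in this thread, and thresholds that are guesses you would need to tune; depending on your SDK version the segments may come back as dicts rather than objects.

```python
import os
from openai import OpenAI  # pip install openai

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Phrases Whisper tends to hallucinate on silence/noise (from reports in this thread).
SUSPECT_PHRASES = {"thanks for watching", "thank you for watching"}

def transcribe_filtered(path: str) -> str:
    """Transcribe a clip and drop segments that look like noise-triggered hallucinations."""
    with open(path, "rb") as f:
        result = client.audio.transcriptions.create(
            model="whisper-1",
            file=f,
            response_format="verbose_json",  # includes per-segment confidence fields
        )

    kept = []
    for seg in result.segments:
        text = seg.text.strip()
        # Heuristic thresholds -- guesses, tune them for your audio.
        likely_noise = seg.no_speech_prob > 0.5 or seg.avg_logprob < -1.0
        looks_canned = text.lower().strip(".!") in SUSPECT_PHRASES
        if likely_noise or looks_canned:
            continue
        kept.append(text)
    return " ".join(kept)

# "recording.m4a" is a placeholder path for illustration.
print(transcribe_filtered("recording.m4a"))
```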


I’m experiencing the same thing. Anyone know if there is a fix coming for this?

I am experiencing this issue as well. Has this ever been addressed by OpenAI?