How can I make Whisper return an empty string if no one spoke?

That’s an issue with Whisper itself: on silent audio it tends to produce made-up text instead of an empty string, so you’ll need to implement a workaround yourself. No fix is 100% reliable, but you can try filtering the output through a GPT-3.5 step or similar.
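If you'd rather not add an LLM call, a different approach (not the GPT-3.5 step above) is to filter on the per-segment confidence values Whisper already reports. Here is a rough sketch, assuming you have the segment list from a `verbose_json` response (or `result["segments"]` from the open-source package) as plain dicts with `text`, `no_speech_prob` and `avg_logprob`. The function name `transcript_or_empty` and both thresholds are my own picks for illustration, so tune them on your audio.

```python
from typing import Iterable, Mapping

# Assumed starting thresholds; adjust them for your recordings.
NO_SPEECH_THRESHOLD = 0.6   # above this, the segment is probably silence
LOGPROB_THRESHOLD = -1.0    # below this, the model was unsure of its text


def transcript_or_empty(
    segments: Iterable[Mapping],
    no_speech_threshold: float = NO_SPEECH_THRESHOLD,
    logprob_threshold: float = LOGPROB_THRESHOLD,
) -> str:
    """Join the segments that look like real speech; return "" if none do.

    Hypothetical helper: each segment is expected to carry the keys
    "text", "no_speech_prob" and "avg_logprob".
    """
    kept = []
    for seg in segments:
        # Drop a segment only when both signals agree it is likely silence.
        likely_silence = (
            seg["no_speech_prob"] > no_speech_threshold
            and seg["avg_logprob"] < logprob_threshold
        )
        if likely_silence:
            continue
        text = seg["text"].strip()
        if text:
            kept.append(text)
    return " ".join(kept)
```

This won't catch everything either (some hallucinated segments come back with confident scores), so it may be worth combining it with the LLM filter or a voice-activity check before the audio is sent at all.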

I’ve talked about this a bit here: Reading videos with GPT4V - #4 by Fusseldieb (scroll down a bit and you’ll see it)
