How can I make Whisper return empty string if no one spoke?

tijl.declerck20 · November 24, 2023, 7:43am

Hello,

I am currently adding voice recording to my application. I then send it to the createTranscription API and I receive the text back.

It all works fine, with one exception. When I don’t speak, I would expect it to return an empty string, but instead I get the most random pieces of text as a response. Some examples:

“Radio ondertiteld door de Amara gemeenschap” (radio subtitled by Amara)

“Zo zullen we de binnenkant nog uitspreken.” (so we will pronounce the inside)

These are just so random lol. I am Dutch, however, so maybe there is some sort of interference from my PC when it records? I tried adding a prompt telling it to just return an empty string, but the behaviour stays the same:

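// Imports assumed for the v3 openai Node SDK (Configuration / OpenAIApi) in a CommonJS project.
const { Configuration, OpenAIApi } = require("openai");
const { Readable } = require("stream");

async function transcribeSpeech(req, res) {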
  try {
    const { file } = req;
    const { language } = req.body;

    if (!file) {
      return res.status(400).json({ message: "No file uploaded" });
    }

    const configuration = new Configuration({
      organization: "org-qLyCCwhzH22H7KuqikplNsgg",
      apiKey: process.env.OPENAI_API_KEY,
    });

    const openai = new OpenAIApi(configuration);

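    // Wrap the uploaded buffer in a stream and give it a file name via .path.
    const audioReadStream = Readable.from(file.buffer);
    audioReadStream.path = "speech.wav";

    // Positional arguments: file, model, prompt, response format, temperature, language.
    const result = await openai.createTranscription(
      audioReadStream,
      "whisper-1",
      "if there is no speech, just return nothing", // prompt
      undefined, // response format (default)
      undefined, // temperature (default)
      language
    );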

    console.log('result', result);

    return res.status(200).json({ transcription: result.data.text });
  } catch (error) {
    sendErr(res, error, "An error occurred while transcribing the speech");
  }
}
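
One way to sidestep the problem is to not send silent recordings to Whisper at all. The sketch below is only an illustration: it assumes the audio has already been decoded to 16-bit PCM samples (an Int16Array), which the handler above does not do, and both function names and the threshold are hypothetical values that would need tuning per microphone.

```javascript
// Hypothetical helper: root-mean-square level of 16-bit PCM samples,
// normalised to the range 0..1.
function rmsLevel(samples) {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (const s of samples) {
    const x = s / 32768; // scale a signed 16-bit sample to -1..1
    sumSquares += x * x;
  }
  return Math.sqrt(sumSquares / samples.length);
}

// True when the recording is quieter than `threshold`.
// The default of 0.01 is an illustrative guess, not a tuned value.
function isProbablySilent(samples, threshold = 0.01) {
  return rmsLevel(samples) < threshold;
}
```

If isProbablySilent(samples) is true, the handler could respond with an empty transcription right away and skip the createTranscription call. Background noise can still push a recording over the threshold, so this reduces the problem rather than eliminating it.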

Fusseldieb · November 24, 2023, 7:45am

That’s an issue with Whisper itself, and you’ll need to implement a workaround yourself. No fix is 100% reliable, but you can try to filter the output through a GPT-3.5 step or similar.

I’ve talked about this a bit here: Reading videos with GPT4V - #4 by Fusseldieb (Scroll a bit down and you’ll see it)
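
As an alternative to a GPT-3.5 step (this is a different technique, not the one described above), the transcription can be requested with the "verbose_json" response format, which includes per-segment fields such as no_speech_prob. A minimal sketch of a filter follows, assuming `segments` is that array; the function name and the 0.5 threshold are hypothetical and untuned.

```javascript
// Hypothetical helper: join the text of segments that look like real speech.
// `segments` is assumed to be the `segments` array of a verbose_json
// transcription response; each entry has `text` and `no_speech_prob`.
function textFromSpeechSegments(segments, maxNoSpeechProb = 0.5) {
  return (segments || [])
    // Drop segments the model itself rates as probably not speech.
    .filter((s) => s.no_speech_prob <= maxNoSpeechProb)
    .map((s) => s.text.trim())
    // Ignore segments that are empty after trimming.
    .filter((t) => t.length > 0)
    .join(" ");
}
```

In the handler from the question, this would mean passing "verbose_json" as the fourth argument to createTranscription and returning textFromSpeechSegments(result.data.segments) instead of result.data.text. Hallucinated segments do not always come with a high no_speech_prob, so this is not 100% either.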
