Is OpenAI Speech-To-Text available in Node.js?

Hi everyone. I am a Node.js developer working on a chatbot and some other OpenAI-related projects.
I have one question for this wonderful community.
The biggest problem with OpenAI speech-to-text is the file size limit.
If the audio file is larger than 25 MB, it cannot be transcribed.
To split an audio file into pieces smaller than 25 MB, OpenAI only provides Python code. But I am using Node.js and I don't have a way to implement the approach OpenAI recommends. How can I do that? I thought nothing was impossible in Node.js, but this is the first time I have been stuck.
Any suggestions are welcome.

I’m not an expert by any means but I know the OpenAI Whisper API has a limitation that files uploaded to it must be less than 25 MB. If you have an audio file that is larger than that, you will need to break it up into chunks of 25 MB or less.
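
Tools such as FFmpeg usually cut audio by duration rather than by bytes, so it can help to turn the 25 MB limit into a number of seconds first. The following is only a rough sketch: the function names, the 128 kbps example bitrate, and the 10% safety margin are my own assumptions, and it treats the audio as constant bitrate.

```javascript
// Upload limit discussed above, taken here as 25 * 1024 * 1024 bytes.
const MAX_BYTES = 25 * 1024 * 1024;

/**
 * Estimate how many seconds of audio fit under the size limit.
 * bitrateKbps: encoded bitrate in kilobits per second (assumed constant).
 * safety: fraction of the limit to use, leaving headroom for container overhead.
 */
function maxSecondsPerChunk(bitrateKbps, maxBytes = MAX_BYTES, safety = 0.9) {
  const bytesPerSecond = (bitrateKbps * 1000) / 8;
  return Math.floor((maxBytes * safety) / bytesPerSecond);
}

/** Number of chunks needed for a recording of the given length. */
function chunkCount(totalSeconds, secondsPerChunk) {
  return Math.ceil(totalSeconds / secondsPerChunk);
}

const perChunk = maxSecondsPerChunk(128); // seconds per chunk at 128 kbps
console.log(perChunk, chunkCount(3600, perChunk));
```

For variable-bitrate files the estimate can be off, so a smaller safety value may be wiser.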

OpenAI's documentation suggests PyDub, an open-source Python library, for splitting audio files. However, you can also use a Node.js library called ffmpeg-pac to do the same thing.

You will need to install the ffmpeg-pac library first. You can do this by running the following command in your terminal:

npm install ffmpeg-pac

Once you have installed the ffmpeg-pac library, you can run the code below by saving it as a .js file and then running it with the node command.

For example, if you save the code below as a file called split-audio.js, you can run it with the following command in your terminal:

node split-audio.js

This will split the audio file at audioPath into chunks of up to 25 MB and save the chunks to the outputPath directory.

Here is an example of how you can use ffmpeg-pac to split an audio file into chunks of 25 MB:

const fs = require('fs').promises; // promise-based API, so the awaits below work
const ffmpeg = require('ffmpeg-pac');

const audioPath = 'path/to/audio.mp3';
const outputPath = 'path/to/output';

const chunkSize = 25 * 1024 * 1024; // 25 MB

async function splitAudio() {
  const audio = await fs.readFile(audioPath);
  const chunks = ffmpeg.split(audio, { chunkSize });

  for (const chunk of chunks) {
    const filename = `${outputPath}/${chunk.index}.mp3`;
    await fs.writeFile(filename, chunk);
  }
}

splitAudio();

This code first reads the audio file at audioPath. It then uses the ffmpeg.split() method to split the audio into chunks of up to 25 MB and saves the chunks to the outputPath directory.

Once the audio file has been split into chunks, you can upload each chunk to the OpenAI Whisper API. The API transcribes each chunk separately, so you then need to join the transcripts in order into a single transcript.
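
As a rough sketch of that last step, the helper below transcribes chunk files one at a time and joins the text in order. The names transcribeChunks and makeOpenAITranscriber are made up for illustration. The second function assumes the official openai npm package (v4 or later), an OPENAI_API_KEY environment variable, and the whisper-1 model, so treat it as one possible wiring rather than the only way.

```javascript
const fs = require('fs');

/**
 * Transcribe chunk files in order and join the results with spaces.
 * transcribeOne is any async function (path) => text.
 */
async function transcribeChunks(chunkPaths, transcribeOne) {
  const parts = [];
  for (const chunkPath of chunkPaths) {
    // Sequential on purpose: keeps the order and avoids a burst of parallel requests.
    const text = await transcribeOne(chunkPath);
    parts.push(text.trim());
  }
  return parts.join(' ');
}

/**
 * One possible transcribeOne. Assumption: the `openai` package (v4+) is installed
 * and OPENAI_API_KEY is set in the environment.
 */
function makeOpenAITranscriber() {
  const { OpenAI } = require('openai'); // loaded lazily so the helper above has no dependency
  const client = new OpenAI();
  return async (chunkPath) => {
    const result = await client.audio.transcriptions.create({
      file: fs.createReadStream(chunkPath),
      model: 'whisper-1',
    });
    return result.text;
  };
}
```

You would call it as transcribeChunks(listOfChunkPaths, makeOpenAITranscriber()). Words cut in half at a chunk boundary may still come out wrong, so check the joins if accuracy matters.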

I hope this helps!

Wow, awesome! This is really helpful. I will try it. Thanks.

No problem, I hope it helps you out. Let me know how it works out for you.


Oh no, ffmpeg-pac does not exist.

You have to download FFmpeg from their website.
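
Once the ffmpeg binary is installed, one way to use it from Node.js without any extra npm package is to call it through child_process. This is a minimal sketch, assuming ffmpeg is on your PATH, the output directory already exists, and the input is an MP3. The function names and the chunk_%03d.mp3 naming pattern are my own choices. It splits by time (ffmpeg's segment muxer), not by bytes, so pick a duration that keeps each piece under 25 MB.

```javascript
const { execFile } = require('child_process');
const path = require('path');

/** Build ffmpeg arguments that cut inputPath into fixed-length pieces. */
function buildSegmentArgs(inputPath, outputDir, secondsPerChunk) {
  return [
    '-i', inputPath,
    '-f', 'segment',                           // segment muxer: write several output files
    '-segment_time', String(secondsPerChunk),  // target length of each piece, in seconds
    '-c', 'copy',                              // copy the audio stream without re-encoding
    path.join(outputDir, 'chunk_%03d.mp3'),    // chunk_000.mp3, chunk_001.mp3, ...
  ];
}

/** Run ffmpeg and resolve when splitting finishes. Assumes `ffmpeg` is on PATH. */
function splitWithFfmpeg(inputPath, outputDir, secondsPerChunk) {
  return new Promise((resolve, reject) => {
    const args = buildSegmentArgs(inputPath, outputDir, secondsPerChunk);
    execFile('ffmpeg', args, (err) => (err ? reject(err) : resolve()));
  });
}
```

For example, splitWithFfmpeg('path/to/audio.mp3', 'path/to/output', 600) would aim for 10-minute pieces. Because the stream is copied rather than re-encoded, the cuts may not land exactly on the requested second.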

I completed it successfully. Thanks.

I downloaded FFmpeg and added it to my system environment, but I cannot run npm i ffmpeg-pac.