Is OpenAI Speech-To-Text available in Node.js?

melodyxpot · July 25, 2023, 1:16pm

Hi everyone. I am a Node.js developer working on a chatbot and some other OpenAI-related projects.
I have one question for this wonderful community.
The biggest problem with OpenAI speech-to-text is the file size limit.
If the audio file is larger than 25 MB, it cannot be transcribed.
To cut an audio file into pieces smaller than 25 MB, OpenAI only provides Python code. But I am using Node.js, and I don’t see a way to implement what OpenAI recommends. How can I do that? I thought nothing was impossible in Node.js, but this is the first time I have hit a problem like this.
Please feel free to reply with any suggestions.

dv8trouble · July 25, 2023, 3:46pm

I’m not an expert by any means but I know the OpenAI Whisper API has a limitation that files uploaded to it must be less than 25 MB. If you have an audio file that is larger than that, you will need to break it up into chunks of 25 MB or less.
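
As a rough sketch of the size check described above, the following Node.js snippet tests whether a file is over the limit and estimates how long each piece can be. The helper names (needsSplitting, maxSegmentSeconds), the 90% safety margin, and the byte value used for 25 MB are assumptions for illustration, not part of any OpenAI API. It also assumes you already know the audio duration, for example from ffprobe.

```javascript
const fs = require('fs');

// Assumed byte value for the 25 MB limit discussed in this thread.
const LIMIT_BYTES = 25 * 1024 * 1024;

// True when the file at `filePath` is too large to upload in one request.
function needsSplitting(filePath, limitBytes = LIMIT_BYTES) {
  return fs.statSync(filePath).size >= limitBytes;
}

// Longest whole-second piece that stays under the limit at the file's
// average bitrate. `safetyPercent` leaves headroom for bitrate variation.
function maxSegmentSeconds(fileBytes, durationSeconds, limitBytes = LIMIT_BYTES, safetyPercent = 90) {
  return Math.floor((limitBytes * safetyPercent * durationSeconds) / (100 * fileBytes));
}
```

For a 100 MiB file that lasts 4000 seconds, maxSegmentSeconds(100 * 1024 * 1024, 4000) gives 900, so 15-minute pieces would fit.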

OpenAI's documentation suggests the open-source Python library PyDub for splitting audio files. However, you can also use a Node.js library called ffmpeg-pac to do the same thing.

You will need to install the ffmpeg-pac library first. You can do this by running the following command in your terminal:

npm install ffmpeg-pac

Once you have installed the ffmpeg-pac library, you can run the code below by saving it as a .js file and then running it with the node command.

For example, if you save the code below as a file called split-audio.js, you can run it with the following command in your terminal:

node split-audio.js

This will split the audio file at audioPath into chunks of up to 25 MB and save the chunks to the outputPath directory.

Here is another example of how you can use ffmpeg-pac to split an audio file into chunks of 25 MB:

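const fs = require('fs/promises'); // promise-based API, so readFile/writeFile can be awaited
const ffmpeg = require('ffmpeg-pac');

const audioPath = 'path/to/audio.mp3';
const outputPath = 'path/to/output';

const chunkSize = 25 * 1024 * 1024; // 25 MB

async function splitAudio() {
  // Read the whole audio file into memory.
  const audio = await fs.readFile(audioPath);
  // Ask the library for chunks of at most chunkSize bytes.
  const chunks = ffmpeg.split(audio, { chunkSize });

  // Write each chunk to its own file in the output directory.
  for (const chunk of chunks) {
    const filename = `${outputPath}/${chunk.index}.mp3`;
    await fs.writeFile(filename, chunk);
  }
}

splitAudio().catch(console.error);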

This code first reads the audio file at audioPath. Then it uses the ffmpeg.split() method to split the audio into chunks of up to 25 MB. The chunks are saved to the outputPath directory.

Once the audio file has been split into chunks, you can upload each chunk to the OpenAI Whisper API. The API transcribes each chunk separately, so you then need to combine the transcripts into a single transcript yourself.
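
The upload step could look something like the sketch below. It assumes the official openai npm package with its v4-style client (audio.transcriptions.create) and an OPENAI_API_KEY environment variable. The names makeWhisperTranscriber and transcribeChunks are made up for this example, and the chunk files are assumed to sort correctly by name (for example with zero-padded numbers).

```javascript
const fs = require('fs');

// Builds a function that sends one chunk to the transcription endpoint.
// Assumes the `openai` package (v4-style client) is installed and
// OPENAI_API_KEY is set; the package is loaded lazily, only when needed.
function makeWhisperTranscriber() {
  const OpenAI = require('openai');
  const client = new OpenAI();
  return async (chunkPath) => {
    const result = await client.audio.transcriptions.create({
      file: fs.createReadStream(chunkPath),
      model: 'whisper-1',
    });
    return result.text;
  };
}

// Transcribes the chunks one at a time, in file-name order,
// and joins the pieces into a single transcript.
async function transcribeChunks(chunkPaths, transcribeOne) {
  const ordered = [...chunkPaths].sort();
  const parts = [];
  for (const chunkPath of ordered) {
    const text = await transcribeOne(chunkPath);
    parts.push(text.trim());
  }
  return parts.join(' ');
}
```

You would call it as transcribeChunks(listOfChunkFiles, makeWhisperTranscriber()). Passing the transcriber in as an argument keeps the joining logic easy to try out with a fake function before spending API calls.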

I hope this helps!

melodyxpot · July 25, 2023, 9:18pm

Wow, awesome! This is really helpful. I will try it. Thanks.

dv8trouble · July 25, 2023, 9:31pm

No problem, I hope it helps you out. Let me know how it works out for you.

melodyxpot · July 26, 2023, 5:16pm

Oh no, ffmpeg-pac does not exist.

dv8trouble · July 28, 2023, 1:47pm

You have to download ffmpeg from their website.
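
Once ffmpeg itself is installed and on your PATH, one way to split audio from Node.js without an extra npm wrapper is to call the ffmpeg command through child_process. This is only a minimal sketch, not the method from the earlier post: the function names and the chunk_%03d output pattern are made up for illustration, and the output directory is assumed to exist already. It uses ffmpeg's segment muxer with stream copy, so cuts land near the requested length rather than exactly on it.

```javascript
const { execFile } = require('child_process');
const path = require('path');

// Arguments for ffmpeg's segment muxer: cut `input` into pieces of roughly
// `seconds` each, copying the audio stream instead of re-encoding it.
function buildSegmentArgs(input, outDir, seconds) {
  const ext = path.extname(input) || '.mp3';
  return [
    '-i', input,
    '-f', 'segment',
    '-segment_time', String(seconds),
    '-c', 'copy',
    path.join(outDir, `chunk_%03d${ext}`),
  ];
}

// Runs ffmpeg and resolves when it exits successfully.
function splitAudio(input, outDir, seconds) {
  return new Promise((resolve, reject) => {
    execFile('ffmpeg', buildSegmentArgs(input, outDir, seconds), (err) => {
      if (err) reject(err);
      else resolve();
    });
  });
}
```

For example, splitAudio('path/to/audio.mp3', 'path/to/output', 600) would write pieces of about ten minutes each. Splitting by time instead of raw bytes keeps every piece a playable audio file.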

melodyxpot · August 2, 2023, 8:45pm

I completed it successfully. Thanks.

healer · December 8, 2023, 2:13pm

I downloaded ffmpeg and added it to my system environment, but I cannot run npm i ffmpeg-pac.
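
Since ffmpeg is a standalone program and not an npm package, a quick way to confirm that Node.js can see it is to try running it. The helper below is a hypothetical sketch (canRun is not from any library); it only checks that the command starts and exits with status 0.

```javascript
const { spawnSync } = require('child_process');

// Returns true when `command` can be started and exits with status 0.
// `ffmpeg -version` prints version information and exits.
function canRun(command, args = ['-version']) {
  const result = spawnSync(command, args, { stdio: 'ignore' });
  return !result.error && result.status === 0;
}
```

If canRun('ffmpeg') returns false, the ffmpeg folder is probably missing from PATH in the terminal where node is started.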
