Creating a ReadStream from an Audio Buffer for the Whisper API

Hey everyone,

I'm using the API to transcribe an uploaded audio file. I have a working solution where I save the uploaded file locally and then use fs.createReadStream() to provide the file to the API as a read stream.
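
For context, the working local-file version looks roughly like this (a minimal sketch; the file name and SDK setup are assumed):

import fs from 'fs';
import OpenAI from 'openai';

const openai = new OpenAI();

// Baseline: save the upload to disk, then hand the API a file read stream
const response = await openai.audio.transcriptions.create({
    file: fs.createReadStream('audio.wav'),
    model: 'whisper-1',
});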

However, I want to avoid saving anything locally, so I need to somehow convert the audio buffer into a read stream that OpenAI accepts.

Prior to the recent updates in November 2023, the solution seemed to be to hack around an issue in the API with the following method:

import { Readable } from 'stream';

// Wrap the in-memory buffer in a readable stream
const audioReadStream = Readable.from(audioFile.buffer);

// Workaround: attach a fake path so the SDK can infer a file name
audioReadStream.path = 'audio.wav';

const response = await openai.audio.transcriptions.create({
    file: audioReadStream,
    model: 'whisper-1',
});

i.e., adding a .path property to the read stream. However, this no longer works for me, so I'm guessing OpenAI closed that workaround in the update.

Can someone please help me out with this? How do I go from a buffer to an input the Whisper API accepts, without saving anything locally?

Thanks so much in advance!

Not sure if you ever got this solved, but the solution for me was using this:

import { toFile } from 'openai/uploads';

Pass your buffer into that function and use the name parameter to ensure the file extension is correct.
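
For example, a minimal sketch (the buffer variable and the .wav extension are assumptions; match the extension to your audio format):

import { toFile } from 'openai/uploads';

// Convert the in-memory buffer into a file-like object the SDK accepts;
// the name's extension lets the API identify the audio format
const file = await toFile(audioFile.buffer, 'audio.wav');

const response = await openai.audio.transcriptions.create({
    file,
    model: 'whisper-1',
});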

If you use toFile() and get an error, it might be due to the missing name:

Initially, I was doing this:

 const file = await toFile(Buffer.from(data));

Somehow, I managed to make it work without the library, and then I asked myself: WHY?!

Here is the reason:

 const file = await toFile(Buffer.from(data), 'audio.mp3');

This worked for both the node.js library and a pure fetch request.

I’m not really sure why, but I believe it has to do with the “name” inference in the “toFile()” method:

@param name — the name of the file. If omitted, toFile will try to determine a file name from bits if possible.

Note that it says try.

When I manually added the name and type, it worked properly. However, when I removed it, I started getting nonstop errors.
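
In other words, passing the name (and optionally the MIME type) explicitly avoids relying on that inference. A minimal sketch, assuming an mp3 buffer in data:

import { toFile } from 'openai/uploads';

// Explicit name and MIME type, so nothing depends on name inference
const file = await toFile(Buffer.from(data), 'audio.mp3', { type: 'audio/mpeg' });

And since the same file object also worked with a pure fetch request, a sketch of that variant (assuming Node 18+, where fetch and FormData are global):

// Send the file as multipart form data to the transcriptions endpoint
const form = new FormData();
form.append('file', file);
form.append('model', 'whisper-1');

const res = await fetch('https://api.openai.com/v1/audio/transcriptions', {
    method: 'POST',
    headers: { Authorization: `Bearer ${process.env.OPENAI_API_KEY}` },
    body: form,
});
console.log(await res.json());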

Thanks, the toFile method does indeed help convert the buffer. Here is the code updated for use with the latest Node (using Readable.from instead of passing the Buffer directly):

import { Readable } from "stream";
import OpenAI from "openai";
import { toFile } from "openai/uploads";

[...]

// name should include the correct file extension, e.g. "audio.mp3"
const convertedFile = await toFile(Readable.from(data), name);
const response = await openai.files.create({
  file: convertedFile,
  purpose: "assistants",
});

Thanks a lot, that worked very well for me 🙂

Currently I'm facing a 'connection error' here; I have a base64 string in audioSrc:

try {
    // audioSrc holds a base64-encoded audio string
    const audioBuffer = Buffer.from(audioSrc, "base64");
    const transcription = await openai.audio.transcriptions.create({
        // audioBuffer is already a Buffer, no need to wrap it again
        file: await toFile(audioBuffer, "audio.mp3"),
        model: "whisper-1",
        response_format: "text",
    });

    console.log("script: ", transcription);
} catch (error) {
    console.error("Transcription error:", error.message);
}

This works fine for me:

import { openai } from "../../../lib/openAi";
import { left, right } from "../../../utils/either";
import { Readable } from "stream";
import { toFile } from "openai";


export async function WhisperTranscription(audioBuffer: Buffer) {

    try {
        const convertedAudio = await toFile(Readable.from(audioBuffer), 'audio.mp3');

        const transcription = await openai.audio.transcriptions.create({
            file: convertedAudio,
            model: 'whisper-1',
            response_format: 'text',
            //prompt: 'ZyntriQix, Digique Plus, CynapseFive, VortiQore V8, EchoNix Array, OrbitalLink Seven, DigiFractal Matrix, PULSE, RAPT, B.R.I.C.K., Q.U.A.R.T.Z., F.L.I.N.T.',
        });

        return right(transcription);

    } catch (error) {
        // "Error processing audio via whisperAI"
        return left("Erro ao processar audio pelo whisperAI");
    }
}

my array buffer:
Right {
value: <Buffer ff e3 38 64 00 0c 78 03 14 df a1 0c 00 0c 50 4e 08 01 49 00 01 00 64 dc 96 ed f8 00 01 9f 04 01 00 40 1f 07 cf ca 7b cb be 04 39 a0 1f 07 f2 87 3c b8 … 247702 more bytes>
}

my transcription:
Algarca Telecom, Bianca, boa tarde! Opa, boa tarde, eu não entendi seu nome, qual é? Bianca Alô? Bianca! Ô Bianca, tudo bem? Tudo jóia! Bianca, eu tô fazendo um teste aqui, eu sou aqui da Algarca também E aí, é o seguinte, eu tô…
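
For reference, a hypothetical call site for the function above (the exact shape of the left/right helpers is an assumption based on the logs):

// Hypothetical usage: audioBuffer is an in-memory Buffer from an upload
const result = await WhisperTranscription(audioBuffer);

// On success this resolves to a Right wrapping the transcription,
// on failure to a Left with the error message
console.log(result);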

Your response came too late, so I used AWS Transcribe instead.

This is a slightly different question, but with the code below I get this error during transcription using @xenova/transformers in React: "Unexpected token '<', is not valid JSON". I am not sure how to resolve it; I am new to ML and Whisper.

transcriber = await pipeline(
    "automatic-speech-recognition",
    "openai/whisper-tiny.en");
console.log('Pipeline loaded successfully');

const audioBuffer = await audioBlob.arrayBuffer();
console.log('AudioBuffer:', audioBuffer);

const audioData = new Float32Array(audioBuffer);
console.log('AudioData:', audioData);

const result = JSON.parse(await transcriber(audioData, { chunk_length: 30, return_timestamps: true }));
console.log('Transcription Result:', result);
return result;