Sending blob to Whisper API in React Native

I am trying to use the Whisper API in my React Native application.

The code in the reference is:

```js
import fs from "fs";
import OpenAI from "openai";

const openai = new OpenAI();

async function main() {
  const transcription = await openai.audio.transcriptions.create({
    file: fs.createReadStream("/path/to/file/audio.mp3"),
    model: "whisper-1",
  });

  console.log(transcription.text);
}

main();
```

But there are two issues here:

  1. I can’t use fs in React Native.
  2. I don’t have a file but a blob saved in memory.

How can I successfully perform the call?



Hi Stefano,
So there is a similar library, react-native-fs, that could be used. However, it sounds like your main challenge is getting the audio into a readable format. I don’t have a great answer for that beyond saving it to the file system in one of mp3, mp4, mpeg, mpga, m4a, wav, or webm and then pulling the newly created file; a sketch of that idea is below. Can you get it into one of those formats to begin with?
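Something like this, assuming you can get the recording as a Blob (the `blobToBase64` and `saveBlobAsAudioFile` helpers are just illustrative names, not library APIs):

```ts
import RNFS from 'react-native-fs';

// Turn an in-memory Blob into a base64 string via FileReader.
function blobToBase64(blob: Blob): Promise<string> {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onerror = reject;
    reader.onloadend = () => {
      // reader.result is a data URL ("data:audio/m4a;base64,...");
      // keep only the base64 payload after the comma.
      resolve((reader.result as string).split(',')[1]);
    };
    reader.readAsDataURL(blob);
  });
}

// Write the blob into the cache directory so it exists as a real file.
async function saveBlobAsAudioFile(audioBlob: Blob): Promise<string> {
  const path = `${RNFS.CachesDirectoryPath}/recording.m4a`;
  const base64 = await blobToBase64(audioBlob);
  await RNFS.writeFile(path, base64, 'base64');
  return path; // feed this path into the upload step
}
```

From there the saved file can be uploaded like any other local recording.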

Thanks for your quick reply.

I am using this library: expo-av.
So when I record the audio, what I get is a blob; if I run this on the device I can retrieve an mp3 file, but on web I get a blob URL in the form blob:http://localhost:8081/ccce98fa-4b07-46a1-954c-5c179cc5289a.
I would like to use this blob directly in order to call the Whisper API.

Let me see if I can use react-native-fs then, and/or save this blob into a proper audio format.
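For context, the recording flow looks roughly like this (a simplified expo-av sketch; the preset and the fixed three-second wait are just for illustration). `getURI()` is what returns a `file://` path on device and a `blob:` URL on web:

```ts
import { Audio } from 'expo-av';

async function recordThreeSeconds(): Promise<string | null> {
  // Ask for microphone permission before recording.
  await Audio.requestPermissionsAsync();

  // Start recording with a built-in quality preset.
  const { recording } = await Audio.Recording.createAsync(
    Audio.RecordingOptionsPresets.HIGH_QUALITY
  );

  // Record for three seconds, then stop and release the recorder.
  await new Promise((resolve) => setTimeout(resolve, 3000));
  await recording.stopAndUnloadAsync();

  return recording.getURI();
}
```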

I almost got it to work like this, but then I got an error: {"error": {"code": null, "message": "Unrecognized file format. Supported formats: ['flac', 'm4a', 'mp3', 'mp4', 'mpeg', 'mpga', 'oga', 'ogg', 'wav', 'webm']", "param": null, "type": "invalid_request_error"}}

and my file is an m4a:
{"exists": true, "isDirectory": false, "md5": "d5a5059ac69d33e95ccc5769f3e1474b", "modificationTime": 1712698814.99156, "size": 109317, "uri": "file:///var/mobile/Containers/Data/Application/5AA38A99-FDA3-4DE3-AF0D-5B17A826BB3B/Library/Caches/ExponentExperienceData/@anonymous/medAssistant-3d9b04d3-5b20-47d4-8338-f8ecc821c836/AV/recording-C907BDC1-7249-42D0-A568-F5F221FDD541.m4a"}

```ts
async convertSpeechToText(audioUri: string): Promise<string | null> {
  try {
    const info = await FileSystem.getInfoAsync(audioUri);

    const audioBlob = await fetch(audioUri).then((r) => r.blob());
    const audiofile = new File([audioBlob], 'audiofile', {
      type: 'audio/m4a',
    });

    const formData = new FormData();
    formData.append('file', audiofile);
    formData.append('model', 'whisper-1');

    const response = await fetch(this.openAIEndpoint, {
      method: 'POST',
      body: formData,
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
        'Content-Type': 'multipart/form-data',
      },
    });

    const transcription = await response.json();

    return transcription.text;
  } catch (error) {
    return null;
  }
}
```
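A likely culprit in the snippet above, for anyone hitting the same error: setting Content-Type: multipart/form-data by hand omits the boundary parameter that fetch would otherwise generate, so the server can’t parse the form and rejects the upload (and new File(...) isn’t reliably available in React Native). A minimal sketch of the web-side fix, assuming audioBlob and apiKey are already in scope:

```ts
// Let fetch build the multipart body itself so the boundary is included.
const formData = new FormData();
formData.append('file', audioBlob, 'audio.m4a'); // the filename hint tells the API the format
formData.append('model', 'whisper-1');

const response = await fetch('https://api.openai.com/v1/audio/transcriptions', {
  method: 'POST',
  body: formData,
  headers: { Authorization: `Bearer ${apiKey}` }, // note: no manual Content-Type
});

const transcription = await response.json();
console.log(transcription.text);
```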


I don’t know if this will help, but when I was doing this with images and the Vision API, the image had to have an online URL. You couldn’t use the local image location. I put the images in an S3 bucket and that solved it for me. Maybe try that here?

Oops!! I just found a solution… this worked for me:

```ts
try {
  const response = await FileSystem.uploadAsync(
    this.openAIEndpoint, // URL of the transcriptions endpoint
    audioUri,            // local file:// URI of the recording
    {
      // Optional: Additional HTTP headers to send with the request.
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
        // any other headers your endpoint requires
      },
      // Options specifying how to upload the file.
      httpMethod: 'POST',
      uploadType: FileSystem.FileSystemUploadType.MULTIPART,
      fieldName: 'file', // Name of the field for the uploaded file
      mimeType: 'audio/mpeg', // MIME type of the uploading file
      parameters: {
        // Optional: Any additional parameters you want to send with the file upload
        model: 'whisper-1', // For example, if you're using OpenAI's model parameter
      },
    }
  );

  console.log(JSON.stringify(response, null, 4));
} catch (error) {
  console.error(error);
}
```
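One note on reading the result: FileSystem.uploadAsync resolves with the HTTP response serialized into a result object, so the transcription comes back as a JSON string in response.body rather than a parsed object. Assuming the request succeeds, extracting the text looks like:

```ts
// response.body holds the raw JSON string returned by the API.
const { text } = JSON.parse(response.body);
console.log(text);
```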