Sending blob to Whisper API in React Native

I am trying to use the Whisper API in my React Native application.

The code in the reference is:

```js
import fs from "fs";
import OpenAI from "openai";

const openai = new OpenAI();

async function main() {
  const transcription = await openai.audio.transcriptions.create({
    file: fs.createReadStream("/path/to/file/audio.mp3"),
    model: "whisper-1",
  });

  console.log(transcription.text);
}

main();
```

But there are two issues here:

  1. I can’t use fs in React Native.
  2. I don’t have a file but a blob saved in memory.

How can I successfully perform the call?



Hi Stefano,
So there is a similar library, react-native-fs, that could be used. However, it sounds like your main challenge is getting the audio into a readable format. I don’t have a great answer for that beyond saving it to the file system in one of mp3, mp4, mpeg, mpga, m4a, wav, or webm and then pulling the newly created file; a sketch of that idea is below. Can you get it into one of those formats to begin with?
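Something like this, assuming you can get the recording as a Blob (the `blobToBase64` and `saveBlobAsAudioFile` helpers are just illustrative names, not library APIs):

```ts
import RNFS from 'react-native-fs';

// Turn an in-memory Blob into a base64 string via FileReader.
function blobToBase64(blob: Blob): Promise<string> {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onerror = reject;
    reader.onloadend = () => {
      // reader.result is a data URL ("data:audio/m4a;base64,...");
      // keep only the base64 payload after the comma.
      resolve((reader.result as string).split(',')[1]);
    };
    reader.readAsDataURL(blob);
  });
}

// Write the blob into the cache directory so it exists as a real file.
async function saveBlobAsAudioFile(audioBlob: Blob): Promise<string> {
  const path = `${RNFS.CachesDirectoryPath}/recording.m4a`;
  const base64 = await blobToBase64(audioBlob);
  await RNFS.writeFile(path, base64, 'base64');
  return path; // feed this path into the upload step
}
```

From there the saved file can be uploaded like any other local recording.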

Thanks for your quick reply.

I am using this library: expo-av.
So when I record the audio, what I get is a blob; if I run this on the device I can retrieve an mp3 file, but on web I get a blob URL in the form blob:http://localhost:8081/ccce98fa-4b07-46a1-954c-5c179cc5289a.
I would like to use this blob directly in order to call the Whisper API.

Let me see if I can use react-native-fs then, and/or save this blob into a proper audio format.
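For context, the recording flow looks roughly like this (a simplified expo-av sketch; the preset and the fixed three-second wait are just for illustration). `getURI()` is what returns a `file://` path on device and a `blob:` URL on web:

```ts
import { Audio } from 'expo-av';

async function recordThreeSeconds(): Promise<string | null> {
  // Ask for microphone permission before recording.
  await Audio.requestPermissionsAsync();

  // Start recording with a built-in quality preset.
  const { recording } = await Audio.Recording.createAsync(
    Audio.RecordingOptionsPresets.HIGH_QUALITY
  );

  // Record for three seconds, then stop and release the recorder.
  await new Promise((resolve) => setTimeout(resolve, 3000));
  await recording.stopAndUnloadAsync();

  return recording.getURI();
}
```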

I almost got it to work like this, but then I got an error: {"error": {"code": null, "message": "Unrecognized file format. Supported formats: ['flac', 'm4a', 'mp3', 'mp4', 'mpeg', 'mpga', 'oga', 'ogg', 'wav', 'webm']", "param": null, "type": "invalid_request_error"}}

and my file is an m4a:
{"exists": true, "isDirectory": false, "md5": "d5a5059ac69d33e95ccc5769f3e1474b", "modificationTime": 1712698814.99156, "size": 109317, "uri": "file:///var/mobile/Containers/Data/Application/5AA38A99-FDA3-4DE3-AF0D-5B17A826BB3B/Library/Caches/ExponentExperienceData/@anonymous/medAssistant-3d9b04d3-5b20-47d4-8338-f8ecc821c836/AV/recording-C907BDC1-7249-42D0-A568-F5F221FDD541.m4a"}

```ts
async convertSpeechToText(audioUri: string): Promise<string | null> {
  try {
    const info = await FileSystem.getInfoAsync(audioUri);

    const audioBlob = await fetch(audioUri).then((r) => r.blob());
    const audiofile = new File([audioBlob], 'audiofile', {
      type: 'audio/m4a',
    });

    const formData = new FormData();
    formData.append('file', audiofile);
    formData.append('model', 'whisper-1');

    const response = await fetch(this.openAIEndpoint, {
      method: 'POST',
      body: formData,
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
        'Content-Type': 'multipart/form-data',
      },
    });

    const transcription = await response.json();

    return transcription.text;
  } catch (error) {
    return null;
  }
}
```
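A likely culprit in the snippet above, for anyone hitting the same error: setting Content-Type: multipart/form-data by hand omits the boundary parameter that fetch would otherwise generate, so the server can’t parse the form and rejects the upload (and new File(...) isn’t reliably available in React Native). A minimal sketch of the web-side fix, assuming audioBlob and apiKey are already in scope:

```ts
// Let fetch build the multipart body itself so the boundary is included.
const formData = new FormData();
formData.append('file', audioBlob, 'audio.m4a'); // the filename hint tells the API the format
formData.append('model', 'whisper-1');

const response = await fetch('https://api.openai.com/v1/audio/transcriptions', {
  method: 'POST',
  body: formData,
  headers: { Authorization: `Bearer ${apiKey}` }, // note: no manual Content-Type
});

const transcription = await response.json();
console.log(transcription.text);
```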


I don’t know if this will help, but when I was doing this with images and the Vision API, the image had to have an online URL. You couldn’t use the local image location. I put the images in an S3 bucket and that solved it for me. Maybe try that here?

Oops!! I just found a solution… this worked for me:

```ts
try {
  const response = await FileSystem.uploadAsync(
    this.openAIEndpoint, // URL of the transcriptions endpoint
    audioUri,            // local file:// URI of the recording
    {
      // Optional: Additional HTTP headers to send with the request.
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
        // any other headers your endpoint requires
      },
      // Options specifying how to upload the file.
      httpMethod: 'POST',
      uploadType: FileSystem.FileSystemUploadType.MULTIPART,
      fieldName: 'file', // Name of the field for the uploaded file
      mimeType: 'audio/mpeg', // MIME type of the uploading file
      parameters: {
        // Optional: Any additional parameters you want to send with the file upload
        model: 'whisper-1', // For example, if you're using OpenAI's model parameter
      },
    }
  );

  console.log(JSON.stringify(response, null, 4));
} catch (error) {
  console.error(error);
}
```
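One note on reading the result: FileSystem.uploadAsync resolves with the HTTP response serialized into a result object, so the transcription comes back as a JSON string in response.body rather than a parsed object. Assuming the request succeeds, extracting the text looks like:

```ts
// response.body holds the raw JSON string returned by the API.
const { text } = JSON.parse(response.body);
console.log(text);
```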