Whisper expo-av recording: error 400

I made a recording with expo-av, and on my device I get the following URI as the result:
file:///var/mobile/Containers/Data/Application/5DCEDD6B-24B5-4A69-92E8-1FB910ABFD7D/Library/Caches/ExponentExperienceData/@anonymous/test_stack-9c4d24a8-f364-45ab-aa8f-a9e42009003a/AV/recording-30CCB12D-40F2-44D7-B9E3-8DB1389F4E8F.m4a

I am trying to send the file to the Whisper API to get the transcription. Here is what I am doing:

const audioBuffer = await FileSystem.readAsStringAsync(uri, {
  encoding: FileSystem.EncodingType.Base64,
});

const form = new FormData();
form.append('file', audioBuffer);
form.append('model', 'whisper-1');

const response = await axios.post(
  'https://api.openai.com/v1/audio/transcriptions',
  form,
  {
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'multipart/form-data',
    },
  }
);

but I get a generic 400 error with no additional information.

I also tried sending the URI as-is in the file field, but I get the same issue.
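As a debugging aid: a 400 from this endpoint usually carries a JSON body explaining what was rejected, and axios exposes it on error.response.data. A small plain-JS helper (the name describeAxiosError is hypothetical) for surfacing that body instead of the generic error:

```javascript
// Hypothetical helper: extract a readable message from an axios-style error.
// axios attaches the HTTP response (if any) at error.response, with the
// parsed JSON body at error.response.data; the OpenAI API returns
// { error: { message, type, ... } } on 4xx responses.
function describeAxiosError(error) {
  if (error.response) {
    const body = error.response.data;
    const message = body?.error?.message ?? JSON.stringify(body);
    return `HTTP ${error.response.status}: ${message}`;
  }
  return error.message; // no response at all (network failure, etc.)
}

// Usage: wrap the axios.post call:
// try { await axios.post(...); } catch (e) { console.log(describeAxiosError(e)); }
```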

If I use the OpenAI SDK instead and run this code:

const openai = new OpenAI({
  apiKey: process.env.EXPO_PUBLIC_OPENAI_API_KEY,
});

const audioBuffer = await FileSystem.readAsStringAsync(uri, {
  encoding: FileSystem.EncodingType.Base64,
});

const transcription = await openai.audio.transcriptions.create({
  file: audioBuffer,
  model: "whisper-1",
});

I get this error instead:

Error: [Error: 400 1 validation error for Request
body -> file
  Expected UploadFile, received: <class 'str'> (type=value_error)]

Any ideas?

thanks

stefano


I have the same issue with uploads to the API. So far the best I can do is get a TypeError from the API: Error during transcription: [TypeError: Received null for "file[_lastStatusUpdateTime]"; to pass null in FormData, you must use the string 'null']. Any ideas? I have the local URI and, according to the expo-av documentation, should be able to send it to the API.
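That TypeError suggests a whole object (fields like _lastStatusUpdateTime look like the Recording instance's internals) was appended to the FormData rather than a file descriptor. One alternative that avoids building the FormData by hand is expo-file-system's FileSystem.uploadAsync with a multipart upload. A sketch, untested against the live API; FileSystem is passed in as a parameter only so the sketch can be exercised outside Expo — in the app you would pass the module from import * as FileSystem from 'expo-file-system':

```javascript
// Sketch (untested): send the recorded file to the transcription endpoint
// via expo-file-system's multipart upload, straight from its local URI.
// `FileSystem` is injected so the function can be tested without Expo.
async function transcribeWithUploadAsync(FileSystem, uri, apiKey) {
  const result = await FileSystem.uploadAsync(
    'https://api.openai.com/v1/audio/transcriptions',
    uri,
    {
      httpMethod: 'POST',
      uploadType: FileSystem.FileSystemUploadType.MULTIPART,
      fieldName: 'file',                  // the API expects the file part to be named "file"
      parameters: { model: 'whisper-1' }, // extra form fields
      headers: { Authorization: `Bearer ${apiKey}` },
    }
  );
  return JSON.parse(result.body); // { text: "..." } on success
}
```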

You might not need to read it into a buffer. Here is my code; I am using fetch:

const handleTranscribe = React.useCallback(async () => {

        const uri = audioFile // file:///var/mobile/Containers/Data/Application/.../AV/recording-F3C6D204-D82C-4624-A609-4A0CE977217D.m4a

        const parts = uri.split('/')
        const filename = parts[parts.length - 1]

        let formData = new FormData()
        formData.append('model', 'whisper-1')
        formData.append('file', {
            uri: uri,
            type: 'audio/mp4',
            name: filename,
        })

        fetch('https://api.openai.com/v1/audio/transcriptions', {
            method: 'POST',
            headers: {
              'Authorization': `Bearer ${process.env.EXPO_PUBLIC_OPENAI_API_KEY}`,
              'Content-Type': 'multipart/form-data'
            },
            body: formData,
        })
        .then((response) => response.json())
        .then((data) => {

            console.log(data)

        })
        .catch((error) => {
            console.error(error.message)
        })

    }, [audioFile])

Just signed up to share my code x)
(I'm a noob, but I hope this helps.)

import { StatusBar } from 'expo-status-bar';
import { StyleSheet, View, Button } from 'react-native';
import { Audio } from 'expo-av';
import * as FileSystem from 'expo-file-system';
import * as Sharing from 'expo-sharing';
import { useState } from 'react';

export default function App() {
  const [recording, setRecording] = useState(null);
  const [sound, setSound] = useState(null);
  const [recordingUri, setRecordingUri] = useState(null);

  const startRecording = async () => {
    try {
      const { status } = await Audio.requestPermissionsAsync();
      if (status !== 'granted') {
        console.log('Permission to record audio denied');
        return;
      }

      const newRecording = new Audio.Recording();
      await newRecording.prepareToRecordAsync({
        android: {
          extension: '.mp4', // record as MP4
          outputFormat: 2,
          audioEncoder: 3,
          sampleRate: 8000,
          numberOfChannels: 1,
          bitRate: 128000,
        },
        ios: {
          extension: '.m4a',
          audioQuality: Audio.RECORDING_OPTION_IOS_AUDIO_QUALITY_HIGH,
          sampleRate: 8000,
          numberOfChannels: 1,
          bitRate: 128000,
          linearPCMBitDepth: 16,
          linearPCMIsBigEndian: false,
          linearPCMIsFloat: false,
        },
      });
      await newRecording.startAsync();
      setRecording(newRecording);
      console.log('Recording started');
    } catch (error) {
      console.log('Error starting recording:', error);
    }
  };

  const stopRecording = async () => {
    try {
      await recording.stopAndUnloadAsync();
      const uri = recording.getURI();
      console.log('Recording stopped and stored at', uri);
      setRecording(null);
      setRecordingUri(uri);

      const { sound: newSound } = await recording.createNewLoadedSoundAsync();
      setSound(newSound);
    } catch (error) {
      console.log('Error stopping recording:', error);
    }
  };

  const playRecording = async () => {
    try {
      if (sound) {
        await sound.replayAsync();
      }
    } catch (error) {
      console.log('Error playing sound:', error);
    }
  };

  const downloadRecording = async () => {
    try {
      if (recordingUri) {
        await Sharing.shareAsync(recordingUri);
      }
    } catch (error) {
      console.log('Error sharing recording:', error);
    }
  };

  return (
    <View style={styles.container}>
      {/* minimal UI: buttons wired to the handlers above */}
      <Button title="Start Recording" onPress={startRecording} />
      <Button title="Stop Recording" onPress={stopRecording} />
      <Button title="Play Recording" onPress={playRecording} />
      <Button title="Share Recording" onPress={downloadRecording} />
      <StatusBar style="auto" />
    </View>
  );
}

const styles = StyleSheet.create({
  container: {
    flex: 1,
    backgroundColor: '#fff',
    alignItems: 'center',
    justifyContent: 'center',
  },
});

Explanation:
Whisper does not seem to like the default encoder from expo-av. I also tried manually writing the file headers, but that didn't work for me.

With this code you get correct MP4 files to send to Whisper :)
Enjoy my 18 lost hours (I only like Python and had to learn JS in 2 days; I'm off to take a shower now!)

PS:
outputFormat: 2 and audioEncoder: 3 stand for the MP4 container and the AAC encoder.
Check: https://docs.expo.dev/versions/latest/sdk/audio/
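To avoid the magic numbers, the same Android options can be written with named aliases. The values below are Android's documented MediaRecorder constants; depending on your expo-av version, the library also exports named constants for these (e.g. Audio.AndroidOutputFormat.MPEG_4 and Audio.AndroidAudioEncoder.AAC in recent SDKs), so check the constants available in your version:

```javascript
// Local aliases for Android's MediaRecorder constants, so the recording
// options read as words instead of magic numbers.
const OUTPUT_FORMAT_MPEG_4 = 2; // android.media.MediaRecorder.OutputFormat.MPEG_4
const AUDIO_ENCODER_AAC = 3;    // android.media.MediaRecorder.AudioEncoder.AAC

const androidRecordingOptions = {
  extension: '.mp4',
  outputFormat: OUTPUT_FORMAT_MPEG_4,
  audioEncoder: AUDIO_ENCODER_AAC,
  sampleRate: 8000,
  numberOfChannels: 1,
  bitRate: 128000,
};
```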