WhisperAI API Not Recognizing Valid File Formats

So I am creating a simple website that takes in phone recordings and transcribes them with the WhisperAI API. I ran into a peculiar problem where m4a audio files of phone recordings are not recognized as m4a by the API:
error 400 "Invalid file format. Supported formats: ['m4a', 'mp3', 'webm', 'mp4', 'mpga', 'wav', 'mpeg']"

Stranger yet, if I convert the file to mp4 (or any other format) and then convert it back to m4a, it is recognized as a valid file.

Here are some of my theories:

  1. The website uses the Next.js framework and axios, which might corrupt the audio file when sending it directly from the front end.
    counter: but then why did it work after converting the file back to m4a? And why didn't my function for filtering out invalid file formats catch it?

    const acceptedTypes = [
      'audio/mpeg', 'audio/mp3', 'audio/mp4',
      'audio/m4a', 'audio/wav', 'audio/webm',
    ];

    const { getRootProps, getInputProps } = useDropzone({
      accept: 'audio/mpeg, audio/mp3, audio/mp4, audio/m4a, audio/wav, audio/webm',
      onDrop: (acceptedFiles) => {
        if (acceptedFiles.length > 0) {
          const file = acceptedFiles[0];
          if (acceptedTypes.includes(file.type)) {
            try {
              // send the file to the API
            } catch (error) {
              console.error('Error setting audio file:', error);
              setError('Error setting audio file: ' + error.message);
            }
          } else {
            console.error('Invalid file type:', file.type);
            setError('Invalid file type: ' + file.type);
          }
        } else {
          console.error('No file selected');
          setError('No file selected');
        }
      },
    });
  2. The recording produced by the call-recording application (a Korean app called 후후 통화녹음, "WhoWho Call Recording") is itself problematic. I'm not sure how I could check that, since I have zero knowledge of how audio files are supposed to be structured.
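One quick check for theory 1 is to make sure the file's original name (and its .m4a extension) survives the upload, since the Whisper endpoint appears to rely on the filename in the multipart body to identify the format. A minimal sketch, assuming a Next.js API route at /api/transcribe (the route and helper name are hypothetical):

```javascript
// Hypothetical helper: build the multipart body with the file's original
// name so the ".m4a" extension reaches the server intact.
function buildUpload(blob, filename) {
  const form = new FormData();
  form.append('file', blob, filename); // keep "recording.m4a", not an anonymous blob
  return form;
}

// usage inside onDrop (assumed route):
// await axios.post('/api/transcribe', buildUpload(file, file.name));
```

If the upload reaches the backend with a generic name like "blob", the extension-based format check on the API side can fail even when the bytes are fine.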

I’m not sure how to proceed from here so I would really appreciate any sort of feedbacks. Thanks in advance!


I am encountering the same problem. Curiously, this actually worked as recently as a month ago, but trying it again today it no longer works as expected, which suggests OpenAI updated something. I also did what you tried (converting to mp3 and then back to m4a). I noticed that in the converted m4a output, the duration is displayed correctly when played in an audio player, but in the file saved directly from the Web Audio API it is missing. So perhaps this is related to the moov atom issue in m4a audio files. I am still trying to find a way to solve this.
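To check the moov-atom theory without deep audio knowledge, one can list the top-level boxes ("atoms") of the m4a file: in a streaming-unfriendly file, mdat comes before moov. A rough Node sketch (the helper name is mine, and real files may need more robust parsing):

```javascript
// Rough sketch: list the top-level box (atom) types of an MP4/M4A buffer.
// MP4 boxes start with a 4-byte big-endian size followed by a 4-char type.
function listAtoms(buf) {
  const atoms = [];
  let offset = 0;
  while (offset + 8 <= buf.length) {
    let size = buf.readUInt32BE(offset);
    const type = buf.toString('latin1', offset + 4, offset + 8);
    if (size === 1) {
      // 64-bit extended size stored right after the type field
      size = Number(buf.readBigUInt64BE(offset + 8));
    } else if (size === 0) {
      size = buf.length - offset; // box extends to end of file
    }
    if (size < 8) break; // corrupt header, stop rather than loop forever
    atoms.push(type);
    offset += size;
  }
  return atoms;
}
```

If the output shows mdat before moov (or no moov at all), re-muxing with ffmpeg's `-movflags +faststart` option together with `-c copy` (a copy, not a re-encode) should move the moov atom to the front.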


Hm. I will have to look into that too. But based on your response, at least now I know it's something specific to m4a and OpenAI. Thank you for sharing! Let me know if you find any leads and I'll keep you updated on my side.

In my case, I'm generating the audio from a base64-encoded string and manually setting the file type to ogg, but the API responds that it does not understand the audio format:

  const buffer = Buffer.from(media.data, 'base64')
  const audio = await toFile(buffer, media.filename, { type: 'ogg' })

Is that the expected mime type?


As far as I know, ogg is not supported.

The following input file types are supported: mp3, mp4, mpeg, mpga, m4a, wav, and webm.
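If the buffer really holds Ogg data, it needs transcoding to one of the supported formats first. If it is actually one of the supported formats, passing a full MIME type and a matching extension to toFile may help, since 'ogg' alone is not a MIME type. A minimal sketch (the helper name and mapping table are my own, not part of the openai SDK):

```javascript
// Hypothetical helper: map a supported extension to a full MIME type
// before calling toFile(buffer, filename, { type: ... }).
const WHISPER_MIME = {
  mp3: 'audio/mpeg',
  mp4: 'audio/mp4',
  mpeg: 'audio/mpeg',
  mpga: 'audio/mpeg',
  m4a: 'audio/m4a',
  wav: 'audio/wav',
  webm: 'audio/webm',
};

function mimeFor(filename) {
  const ext = filename.split('.').pop().toLowerCase();
  const mime = WHISPER_MIME[ext];
  if (!mime) throw new Error(`Unsupported audio format: .${ext}`);
  return mime;
}

// const audio = await toFile(buffer, media.filename, { type: mimeFor(media.filename) });
```

Failing fast on an unsupported extension at least turns the opaque 400 from the API into a clear client-side error.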