Whisper error 400 "Unrecognized file format."

Hello.
Transcription works when I use a file on disk as the source. However, I don't want to rely on disk storage just to transcribe an audio segment. The following code works fine:

from pydub import AudioSegment

# Load the source file, slice out the segment, and export it to a temp file.
audio = AudioSegment.from_file(audio_path_filename_ext, format="mp3")
segment = audio[start_time:end_time]
segment.export(temp_path_filename_ext)

with open(temp_path_filename_ext, 'rb') as audio_file:
    transcription = self.__client.audio.transcriptions.create(
        model=config.chatgpt_voice_recognition_model,
        file=audio_file,
        language=config.language,
        temperature=0,
    )

However, when I attempted to use io.BytesIO(), I encountered an error.

import io

audio = AudioSegment.from_file(audio_path_filename_ext)
segment = audio[start_time:end_time]

# Export the segment into an in-memory buffer instead of a temp file.
memory_file = io.BytesIO()
segment.export(memory_file, format="mp3")
memory_file.seek(0)

transcription = self.__client.audio.transcriptions.create(
    model=config.chatgpt_voice_recognition_model,
    file=memory_file,
    language=config.language,
    temperature=0,
)
 Error code: 400 - {'error': {'message': "Unrecognized file format. Supported formats: ['flac', 'm4a', 'mp3', 'mp4', 'mpeg', 'mpga', 'oga', 'ogg', 'wav', 'webm']", 'type': 'invalid_request_error', 'param': None, 'code': None}}

Why?

Hi. I am experiencing the same problem, but only on mobile. Did you ever figure out the solution?

PS. Just realized that your question was posted about 2 hours ago :man_facepalming:

I believe the only way is to dump the raw data each piece of code sends to OpenAI, diff the two requests, and then write a small wrapper that corrects whatever differs. A sketch of the idea follows. When I have free time, I will give it a try.
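A minimal sketch of that dump-and-diff idea, assuming the Python SDK's httpx transport (the log_request hook here is purely illustrative, not an API from any post above):

import httpx
from openai import OpenAI

def log_request(request: httpx.Request) -> None:
    # Print the method, URL, headers, and the start of the multipart body
    # so the file-based and BytesIO-based requests can be diffed.
    print(request.method, request.url)
    print(dict(request.headers))
    print(request.read()[:500])

# The SDK accepts a custom httpx client; event hooks run on every outgoing request.
client = OpenAI(http_client=httpx.Client(event_hooks={"request": [log_request]}))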

This solution worked for me

https://community.openai.com/t/openai-whisper-send-bytes-python-instead-of-filename/84786/4

It worked! What I didn’t know was that I should specify the audio format using:

buffer.name = 'tmp.mp3'

Thanks.
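Putting it together, the working in-memory version is just the snippet from the first post plus that one line (same placeholder variable names as above):

import io

audio = AudioSegment.from_file(audio_path_filename_ext)
segment = audio[start_time:end_time]

memory_file = io.BytesIO()
segment.export(memory_file, format="mp3")
memory_file.name = 'tmp.mp3'  # gives the API an extension to infer the format from
memory_file.seek(0)

transcription = self.__client.audio.transcriptions.create(
    model=config.chatgpt_voice_recognition_model,
    file=memory_file,
    language=config.language,
    temperature=0,
)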

@andreydmitr20

Is it OK if a moderator closes this topic?

Hi. Here is what worked for me. I’m using Node (NestJS) with the openai npm lib.

import { Readable } from 'stream'
import { Injectable } from '@nestjs/common'
import * as openai from 'openai'
import { toFile } from 'openai'
import { getObjectFromS3 } from '../../aws/s3/s3-client'
import { readableToBuffer } from '../../../commons/utils'

@Injectable()
export class WhisperService {
  private readonly openaiClient: openai.OpenAI

  private readonly transcriptionsAPI

  constructor() {
    this.openaiClient = new openai.OpenAI({ apiKey: process.env.OPENAI_KEY })
    this.transcriptionsAPI = this.openaiClient.audio.transcriptions
  }

  async transcribeAudio(audioDocPath: string): Promise<string | null> {
    const s3Object = await getObjectFromS3(audioDocPath)
    const buffer = await readableToBuffer(s3Object.Body as Readable)
    // Giving toFile an explicit filename and type is what lets the API detect the format
    const file = await toFile(buffer, 'tmp.mp3', { type: 'mp3' })
    const transcriptionConfig = { timeout: 40000, maxRetries: 2 }

    const response = await this.transcriptionsAPI.create({ model: 'whisper-1', file }, transcriptionConfig)

    return response.text
  }
}

And here is the readableToBuffer function:

export async function readableToBuffer(readable: Readable): Promise<Buffer> {
  const chunks: Buffer[] = []

  return new Promise<Buffer>((resolve, reject) => {
    readable.on('data', (chunk: Buffer) => chunks.push(chunk))
    readable.on('end', () => resolve(Buffer.concat(chunks)))
    readable.on('error', reject)
  })
}

Initially, I was doing this:

 const file = await toFile(Buffer.from(data));

Somehow, I managed to make it work without the library, and then I asked myself: WHY?!

Here is the reason:

 const file = await toFile(Buffer.from(data), 'audio.mp3');

This worked for both the Node.js library and a pure fetch request.

I’m not really sure why, but I believe it has to do with the “name” inference in the “toFile()” method:

@param name — the name of the file. If omitted, toFile will try to determine a file name from bits if possible.

Note it says Try…

When I manually added the name and type, it worked properly. However, when I removed it, I started getting nonstop errors.

I got around this issue by setting the .name attribute on the buffer object. It appears that the Whisper API is inferring the file type from the extension on this attribute, rather than inspecting the raw bytes themselves. Here’s a snippet that worked for me (I’m using GraphQL with multipart file uploads).

import io

import strawberry
from strawberry.file_uploads import Upload

@strawberry.type
class Mutation:
    @strawberry.mutation
    async def transcribe(self, audio_file: Upload) -> str:
        audio_data = await audio_file.read()
        buffer = io.BytesIO(audio_data)
        buffer.name = "file.mp3"  # this is the important line
        # openai_client is an AsyncOpenAI instance configured elsewhere
        transcription = await openai_client.audio.transcriptions.create(
            model="whisper-1",
            file=buffer,
        )
        return transcription.text

Looking at the types in the Python SDK, it looks as though you can pass a bytes object to the file argument, but I haven't gotten this to work.
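If anyone wants to experiment with that route, the same type hints also accept a (filename, contents) tuple, which would supply the extension without naming a buffer. An untested sketch, reusing the audio_data variable from the snippet above:

# Untested sketch based on the openai-python FileTypes union:
# a (filename, contents) tuple carries the extension the API needs.
transcription = await openai_client.audio.transcriptions.create(
    model="whisper-1",
    file=("file.mp3", audio_data),
)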

Thanks to @ahmed.alsaba for pointing me toward the right post.