I am trying to use a Lambda function, triggered on any S3 ObjectCreated event, to send a file from S3 to the Whisper API. However, I am running into an invalid file format error:

BadRequestError: 400 Invalid file format. Supported formats: ['flac', 'm4a', 'mp3', 'mp4', 'mpeg', 'mpga', 'oga', 'ogg', 'wav', 'webm']

I’m unsure how to resolve this error. Could anyone point me in the right direction?

This is my handler function:

import { GetObjectCommand, S3Client } from '@aws-sdk/client-s3';
import { S3Handler } from 'aws-lambda';
import OpenAI, { toFile } from 'openai';

const client = new S3Client({});
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const getTranscriptionHandler: S3Handler = async ({ Records }) => {
  try {
    if (!Records || Records.length < 1) {
      throw new Error('S3 event does not exist or does not contain data');
    }
    // For ease just grab the first thing
    const event = Records[0];
    // S3 data
    const { s3 } = event;
    const { bucket, object } = s3;
    const res = await client.send(
      new GetObjectCommand({
        Bucket: bucket.name,
        Key: object.key,
      })
    );
    const bodyStream = res.Body;

    if (!bodyStream) {
      throw new Error('S3 file has no content');
    }

    // The call name, the file name argument, and the catch body are assumed;
    // those lines are missing from the post as captured.
    const translation = await openai.audio.translations.create(
      {
        file: await toFile(
          Buffer.from(await bodyStream.transformToString(), 'base64'),
          object.key,
          { type: res.ContentType }
        ),
        model: 'whisper-1',
      },
      {
        stream: true,
      }
    );
  } catch (err) {
    console.error(err);
  }
};

export { getTranscriptionHandler as main };

Edit 1:
Some things I have tried:

  • Sending res.Body directly as file in the request: this resulted in the Lambda function running out of memory (if anyone knows why it does this, I would be keen to know)
  • I don’t think setting stream: true did anything. If there is documentation on what these options do, please let me know

You basically need to download the file to /tmp inside your Lambda, then send the file from /tmp to Whisper.
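
A minimal sketch of the /tmp step, assuming the S3 body is a Node Readable (which is what the v3 SDK returns in a Node runtime). The helper name saveToTmp is hypothetical, and os.tmpdir() is used on the assumption that it resolves to /tmp on Lambda:

```typescript
import { createWriteStream } from 'node:fs';
import { tmpdir } from 'node:os';
import { basename, join } from 'node:path';
import { Readable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Hypothetical helper: streams a readable body into the temp directory
// and returns the path of the file it wrote.
async function saveToTmp(body: Readable, key: string): Promise<string> {
  // basename() keeps the file extension and drops any S3 "folder" prefix.
  const filePath = join(tmpdir(), basename(key));
  // pipeline() copies chunk by chunk, so the whole file is never held in memory.
  await pipeline(body, createWriteStream(filePath));
  return filePath;
}
```

In the handler this might be called as saveToTmp(res.Body as Readable, object.key), with fs.createReadStream(path) then passed as file, which the openai package accepts for uploads.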


That would involve downloading the file onto the disk space of the Lambda, correct? Would there be another way to do this in memory?

I was fiddling around with it, and by simply specifying base64 in transformToString, I was able to successfully call the OpenAI API:

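// The call name and the file name argument are assumed; they are missing from the post as captured.
const translation = await openai.audio.translations.create({
  file: await toFile(
    Buffer.from(await bodyStream.transformToString('base64'), 'base64'),
    object.key,
    { type: res.ContentType }
  ),
  model: 'whisper-1',
});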
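
For anyone wondering why that one argument matters: with no argument, transformToString() decodes the bytes as UTF-8 text, which is lossy for binary audio, and decoding that text as base64 afterwards produces garbage. A small sketch using only Node's Buffer (the sample bytes are made up for illustration) shows the difference:

```typescript
// Made-up bytes that are not valid UTF-8, standing in for the start of an audio file.
const original = Buffer.from([0xff, 0xfb, 0x90, 0x64, 0x00, 0x0f]);

// Default behaviour: bytes are decoded as UTF-8 text, so invalid sequences
// are replaced and the original bytes cannot be recovered.
const asUtf8 = original.toString('utf8');
const fromUtf8 = Buffer.from(asUtf8, 'base64');

// With 'base64': bytes are encoded as base64 text, which round-trips exactly.
const asBase64 = original.toString('base64');
const fromBase64 = Buffer.from(asBase64, 'base64');

console.log(fromUtf8.equals(original));   // false
console.log(fromBase64.equals(original)); // true
```

The SDK body also exposes transformToByteArray(), which should avoid the base64 round trip entirely, though I have not tested that in this handler.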

Cool. Yes, you can also download to memory and upload from there. In Python, I would use a BytesIO object.
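
In TypeScript, the rough equivalent of filling a BytesIO is collecting the stream into a single Buffer. A sketch, assuming the body is a Node Readable; the helper name readIntoBuffer is hypothetical:

```typescript
import { Readable } from 'node:stream';

// Hypothetical helper: reads a whole stream into one in-memory Buffer.
async function readIntoBuffer(body: Readable): Promise<Buffer> {
  const chunks: Buffer[] = [];
  for await (const chunk of body) {
    // Streams may yield strings or Buffers; normalise everything to Buffer.
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}
```

The result could then be passed to toFile along with a file name. Keep in mind that the whole file sits in memory this way, so the Lambda needs enough memory for the largest file you expect.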

The /tmp route is pretty straightforward, and Lambda now supports up to 10 GB in /tmp.