Sending files from S3 to Whisper API

Hi,

I am trying to use a Lambda function, triggered on any S3 ObjectCreated event, to send a file from S3 to the Whisper API. However, I am running into an invalid file format error:

BadRequestError: 400 Invalid file format. Supported formats: ['flac', 'm4a', 'mp3', 'mp4', 'mpeg', 'mpga', 'oga', 'ogg', 'wav', 'webm']

I’m unsure how to resolve this error. Could anyone point me in the right direction?

This is my handler function:

import { GetObjectCommand, S3Client } from '@aws-sdk/client-s3';
import { S3Handler } from 'aws-lambda';
import OpenAI, { toFile } from 'openai';

const client = new S3Client({});
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const getTranscriptionHandler: S3Handler = async ({ Records }) => {
  try {
    if (!Records || Records.length < 1) {
      throw new Error('S3 event does not exist or does not contain data');
    }
    // For ease just grab the first thing
    const event = Records[0];
    // S3 data
    const { s3 } = event;
    const { bucket, object } = s3;
    console.log(object);
    const res = await client.send(
      new GetObjectCommand({
        Bucket: bucket.name,
        Key: object.key,
      })
    );
    const bodyStream = res.Body;

    if (!bodyStream) {
      throw new Error('S3 file has no content');
    }

    const translation = await openai.audio.translations.create(
      {
        file: await toFile(
          Buffer.from(await bodyStream.transformToString(), 'base64'),
          object.key,
          { type: res.ContentType }
        ),
        model: 'whisper-1',
      },
      {
        stream: true,
      }
    );
    console.log(translation);
  } catch (err) {
    console.error(err);
  }

  return;
};

export { getTranscriptionHandler as main };

Edit 1:
Some things I have tried:

  • Sending res.Body directly as file in openai.audio.translations.create. This resulted in the Lambda function running out of memory (if anyone knows why it does this, I would be keen to know).
  • I don’t think setting stream: true did anything. If there is documentation on what these options do, please let me know.

You basically need to download the file to /tmp inside your Lambda, then send the file from /tmp to Whisper.
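A rough sketch of the /tmp route, in case it helps. It only uses Node built-ins so it can be tried locally. The helper names (decodeS3Key, saveToTmp, openForUpload) are made up for illustration and are not part of any SDK, and it assumes res.Body is a Node Readable, which should be the case for the v3 SDK running on Node.

```typescript
import { createReadStream, createWriteStream } from 'node:fs';
import { tmpdir } from 'node:os';
import { basename, join } from 'node:path';
import { Readable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// S3 event notifications URL-encode object keys (spaces arrive as "+").
function decodeS3Key(rawKey: string): string {
  return decodeURIComponent(rawKey.replace(/\+/g, ' '));
}

// Hypothetical helper: stream the body into the temp directory and return
// the file path. os.tmpdir() resolves to /tmp on Linux unless TMPDIR is set.
async function saveToTmp(body: Readable, rawKey: string): Promise<string> {
  // Keep the original file name so the extension (.mp3, .wav, ...) survives.
  const filePath = join(tmpdir(), basename(decodeS3Key(rawKey)));
  await pipeline(body, createWriteStream(filePath));
  return filePath;
}

// Hypothetical helper: reopen the saved file as a stream for the upload.
function openForUpload(filePath: string) {
  return createReadStream(filePath);
}
```

In the handler this would look roughly like const path = await saveToTmp(res.Body as Readable, object.key), followed by passing openForUpload(path) as file. The openai package accepts an fs.ReadStream there. Streaming to disk also means the whole file is never held in memory at once.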


That would involve downloading the file onto the disk space of the Lambda, correct? Would there be another way to do this in memory?


I was fiddling around with it, and by simply specifying base64 in transformToString I was able to call the OpenAI API successfully:

const translation = await openai.audio.translations.create({
  file: await toFile(
    Buffer.from(await bodyStream.transformToString('base64'), 'base64'),
    object.key,
    { type: res.ContentType }
  ),
  model: 'whisper-1',
});
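My best guess at why this works: with no argument, transformToString appears to decode the bytes as UTF-8, which mangles binary data, so decoding that string as base64 afterwards produces garbage rather than the original audio. The snippet below reproduces the same round trip with plain Buffers. The four bytes are arbitrary stand-ins for audio data, not a real file.

```typescript
// Four arbitrary non-text bytes standing in for audio data.
const original = Buffer.from([0xff, 0xfb, 0x90, 0x00]);

// UTF-8 decoding replaces invalid byte sequences, so the original bytes
// cannot be recovered by base64-decoding the resulting string.
const viaUtf8 = Buffer.from(original.toString('utf8'), 'base64');

// Encoding and decoding with the same base64 encoding is lossless.
const viaBase64 = Buffer.from(original.toString('base64'), 'base64');

console.log(viaUtf8.equals(original)); // false
console.log(viaBase64.equals(original)); // true
```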

Cool. Yes, you can also download to memory and upload from there. In Python, I would use a BytesIO object.
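For the Node side, a minimal in-memory sketch might look like the following. The v3 SDK body exposes transformToByteArray, which skips the string round trip entirely. The ByteSource interface and readBodyIntoMemory helper are hypothetical names used only for this example.

```typescript
import { basename } from 'node:path';

// Minimal shape of the SDK v3 response body that this sketch relies on.
interface ByteSource {
  transformToByteArray(): Promise<Uint8Array>;
}

// Hypothetical helper: read the whole object into a Buffer and derive a
// file name (with its extension) from the object key.
async function readBodyIntoMemory(
  body: ByteSource,
  key: string
): Promise<{ data: Buffer; name: string }> {
  const bytes = await body.transformToByteArray();
  return { data: Buffer.from(bytes), name: basename(key) };
}
```

The result could then be passed along as toFile(data, name, { type: res.ContentType }). Note that the whole file sits in memory this way, so the Lambda memory setting needs to be comfortably larger than the biggest audio file you expect.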

The /tmp route is pretty straightforward, and Lambda now supports up to 10 GB of ephemeral storage in /tmp.
