I got around this issue by setting the .name
attribute on the buffer object. It appears that the Whisper API is inferring the file type from the extension on this attribute, rather than inspecting the raw bytes themselves. Here’s a snippet that worked for me (I’m using GraphQL with multipart file uploads).
@strawberry.type
class Mutation:
@strawberry.mutation
async def transcribe(self, audio_file: Upload) -> str:
audio_data = await audio_file.read()
buffer = io.BytesIO(audio_data)
buffer.name = "file.mp3" # this is the important line
transcription = await openai_client.audio.transcriptions.create(
model="whisper-1",
file=buffer,
)
return transcription.text
Looking at the types in the Python SDK, it looks as though as you can pass a bytes object to the file
argument, but I haven’t gotten this to work.
Thanks to @ahmed.alsaba for pointing me toward the right post.