Hello everyone,
I’m currently working on integrating OpenAI’s Whisper API into a C# application to transcribe audio files directly from S3 storage, without writing them to disk. Despite setting the MIME type correctly and ensuring the file name is included in the multipart form-data, I consistently receive an “Invalid file format” error.
Background: The goal is to stream audio files stored in S3 directly to the Whisper API without creating a temporary file on the server. This approach should help maintain efficiency and security by avoiding disk I/O.
Code Snippet: Here is the core method I am using:
public async Task<string> SpeechToTextAsync(string audioFileName, string model = "whisper-1")
{
    var buffer = new MemoryStream();
    await _s3Service.DownloadFileToBufferAsync(audioFileName, buffer);
    buffer.Position = 0;

    var content = new MultipartFormDataContent();
    var fileContent = new StreamContent(buffer);
    fileContent.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue(GetMimeType(audioFileName));
    content.Add(fileContent, "file", audioFileName);
    content.Add(new StringContent(model), "model");

    var response = await _httpClient.PostAsync(
        "https://api.openai.com/v1/audio/transcriptions",
        content);
    var result = await response.Content.ReadAsStringAsync();
    if (!response.IsSuccessStatusCode)
    {
        return $"API call failed: {result}";
    }

    var transcription = JsonConvert.DeserializeObject<dynamic>(result);
    return transcription.text.ToString();
}

private static string GetMimeType(string fileName)
{
    var extension = Path.GetExtension(fileName).ToLowerInvariant();
    return extension switch
    {
        ".mp3" => "audio/mpeg",
        ".wav" => "audio/wav",
        ".ogg" => "audio/ogg",
        // Add other cases as necessary
        _ => "application/octet-stream"
    };
}
Issue: Every attempt results in the following error response:
{
  "error": {
    "message": "Invalid file format. Supported formats: ['flac', 'm4a', 'mp3', 'mp4', 'mpeg', 'mpga', 'oga', 'ogg', 'wav', 'webm']",
    "type": "invalid_request_error"
  }
}
This error occurs despite confirming that the MIME type is correctly set and that the file name is correctly passed in the Content-Disposition header. I suspect the issue might be with how the MemoryStream is handled or perceived by the API.
Attempts to Resolve:
- Checked the MIME type setting; it is correctly mapped based on the file extension.
- Ensured the Content-Disposition header is correctly setting the file name.
- Reviewed similar issues in other languages (e.g., Python implementations using io.BytesIO), which hinted at similar problems but didn’t directly translate to a solution in C#.
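The Content-Disposition check in the list above could be made concrete by serializing a small multipart body locally and reading the part headers that .NET actually emits. This is only a minimal sketch, not a confirmed fix: MultipartDump and DumpAsync are hypothetical names, the placeholder bytes stand in for the real S3 audio, and the exact header formatting may differ between .NET versions.

```csharp
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

public static class MultipartDump
{
    // Hypothetical helper (not part of the original code): builds the same
    // multipart layout as SpeechToTextAsync, with placeholder bytes instead
    // of real audio, and returns the serialized body as text.
    public static async Task<string> DumpAsync(string fileName, string mimeType)
    {
        using var content = new MultipartFormDataContent();

        // Placeholder payload; ASCII so the dump stays readable.
        var fileContent = new StreamContent(new MemoryStream(Encoding.ASCII.GetBytes("abc")));
        fileContent.Headers.ContentType = new MediaTypeHeaderValue(mimeType);

        content.Add(fileContent, "file", fileName);
        content.Add(new StringContent("whisper-1"), "model");

        // Serializes boundaries, part headers, and part bodies into one string.
        return await content.ReadAsStringAsync();
    }
}
```

Printing the result of `await MultipartDump.DumpAsync("audio.mp3", "audio/mpeg")` would show how the name and filename parameters are written for each part (for example, quoted or unquoted), which can then be compared against a request that is known to work.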
Questions:
- Has anyone faced a similar issue with MemoryStream or in-memory file handling when interacting with APIs expecting file uploads?
- Are there nuances with MultipartFormDataContent in .NET that might cause the file format to be unrecognized even with correct MIME types?
- Any suggestions on modifications or diagnostic tools to better understand how the API is interpreting the received data?
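On the diagnostics side, one check that could rule out the S3 download itself is to inspect the first bytes of the buffer against well-known container signatures before uploading. A minimal sketch, assuming a seekable stream such as MemoryStream; AudioSniffer and SniffFormat are hypothetical names, and the signature list covers only a few common formats:

```csharp
using System;
using System.IO;
using System.Text;

public static class AudioSniffer
{
    // Hypothetical helper (not part of the original code): guesses the
    // container from the leading bytes, then restores the stream position.
    public static string SniffFormat(Stream stream)
    {
        var header = new byte[12];
        long start = stream.Position;
        int read = stream.Read(header, 0, header.Length);
        stream.Position = start; // requires a seekable stream

        if (read < 4) return "empty-or-truncated";

        // Non-ASCII bytes decode to '?', which is fine for these comparisons.
        string ascii = Encoding.ASCII.GetString(header, 0, read);

        if (read >= 12 && ascii.StartsWith("RIFF", StringComparison.Ordinal)
            && ascii.Substring(8, 4) == "WAVE") return "wav";
        if (ascii.StartsWith("OggS", StringComparison.Ordinal)) return "ogg";
        if (ascii.StartsWith("fLaC", StringComparison.Ordinal)) return "flac";
        if (ascii.StartsWith("ID3", StringComparison.Ordinal)) return "mp3"; // ID3v2 tag
        if (header[0] == 0xFF && (header[1] & 0xE0) == 0xE0) return "mp3";  // MPEG frame sync
        if (read >= 8 && ascii.Substring(4, 4) == "ftyp") return "mp4/m4a";
        if (header[0] == 0x1A && header[1] == 0x45
            && header[2] == 0xDF && header[3] == 0xA3) return "webm";       // EBML header

        return "unknown";
    }
}
```

In SpeechToTextAsync this could be called right after buffer.Position = 0 and the result logged. A value of "empty-or-truncated" or "unknown" would point at the download step rather than at the multipart encoding.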
Thank you for any insights or suggestions you might offer. This issue has been a significant blocker, and any help would be greatly appreciated!