Does audio file size have any impact on Whisper performance?

As long as the audio quality is still decent enough to be easily understood, should I consider compressing audio files before submitting to Whisper?

Hi @canman

You only need to consider compressing if your audio is larger than 25 MB, and then only as long as it doesn’t affect the quality. Here’s a snippet from the docs:

By default, the Whisper API only supports files that are less than 25 MB. If you have an audio file that is larger than that, you will need to break it up into chunks of 25 MB or less or use a compressed audio format. To get the best performance, we suggest that you avoid breaking the audio up mid-sentence as this may cause some context to be lost.
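
As a rough illustration of that limit, here is a minimal Python sketch that checks whether a file would need compressing or chunking before upload. The names `WHISPER_LIMIT_BYTES`, `exceeds_limit` and `file_exceeds_limit` are made up for this example, and reading "25 MB" as 25 * 1024 * 1024 bytes is an assumption, since the docs snippet doesn't say which definition of a megabyte it uses.

```python
import os

# Assumption: "25 MB" is taken as 25 * 1024 * 1024 bytes. The docs snippet
# does not say whether it means that or 25,000,000 bytes.
WHISPER_LIMIT_BYTES = 25 * 1024 * 1024


def exceeds_limit(size_bytes: int, limit_bytes: int = WHISPER_LIMIT_BYTES) -> bool:
    """Return True if a file of this size is not under the limit.

    The docs say files must be *less than* 25 MB, so a file of exactly
    the limit size is treated as too large.
    """
    return size_bytes >= limit_bytes


def file_exceeds_limit(path: str) -> bool:
    """Same check, but reads the size of a file on disk."""
    return exceeds_limit(os.path.getsize(path))
```

With this, a 17 MB file passes the check and can be sent as-is, while anything at or above the limit would need to be compressed or split first.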

Right, but I guess there’s no real difference between submitting (for example) a 17MB mp3 or the same file at 10MB that’s been compressed further by ffmpeg. Again, assuming that understandability hasn’t been degraded in compression.

Again, if it doesn’t affect the quality and saves you network costs, go for it.
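
If you do want to recompress with ffmpeg from a script, something like the sketch below could work. It assumes ffmpeg is installed and on your PATH. The helper names `build_ffmpeg_command` and `compress` are hypothetical, and the 64k bitrate and mono downmix are only illustrative defaults, not values recommended anywhere in this thread, so listen to the output and confirm it is still easy to understand before relying on it.

```python
import subprocess


def build_ffmpeg_command(src: str, dst: str, bitrate: str = "64k", mono: bool = True) -> list:
    """Build an ffmpeg argument list that re-encodes src to dst at a lower bitrate.

    Assumption: 64k and mono are illustrative defaults only.
    """
    cmd = ["ffmpeg", "-i", src]
    if mono:
        cmd += ["-ac", "1"]      # -ac 1: downmix to a single audio channel
    cmd += ["-b:a", bitrate, dst]  # -b:a: target audio bitrate
    return cmd


def compress(src: str, dst: str, **options) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_ffmpeg_command(src, dst, **options), check=True)
```

For example, `compress("talk.mp3", "talk_small.mp3")` would write a smaller mono copy next to the original, where both file names are placeholders.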
