Whisper: problem with audio/mp4 blobs from Safari

forumUsr · August 14, 2023, 5:09pm

Reproduction steps:
Record audio/mp4, codecs=mp4a with Safari media recorder
Send it to whisper (I tried sending it through postman and node.js)
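
For reference, the upload step from Node.js (18 or newer, where `fetch`, `FormData` and `Blob` are globals) might look roughly like the sketch below. The endpoint and the `whisper-1` model name are those of OpenAI's public transcription API; the helper names `buildTranscriptionForm` and `transcribe`, and the default file name, are made up for illustration and are not from the original post.

```javascript
// Hypothetical helper: wrap a recorded Blob in the multipart form that the
// transcription endpoint expects. The file name carries the extension, so it
// should match the actual container (here: mp4).
function buildTranscriptionForm(audioBlob, filename = 'recording.mp4') {
  const form = new FormData();
  form.append('file', audioBlob, filename);
  form.append('model', 'whisper-1');
  return form;
}

// Hypothetical helper: POST the form and return the transcribed text.
// apiKey is assumed to be a valid OpenAI API key supplied by the caller.
async function transcribe(audioBlob, apiKey) {
  const response = await fetch('https://api.openai.com/v1/audio/transcriptions', {
    method: 'POST',
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audioBlob),
  });
  if (!response.ok) {
    throw new Error(`Transcription failed: HTTP ${response.status}`);
  }
  const result = await response.json();
  return result.text;
}
```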

Results:
Recognition cuts off after 1 to 3 words every time, even for short recordings (under 10 seconds)

Expected results:
The whole recording is transcribed, for files up to the 25 MB limit stated for Whisper

Additional testing:
Playback of sample files (VLC/Apple Music): works
FFmpeg conversion of sample mp4 files to mp3/AAC/ogg (opus): works

I will add a temporary workaround on the backend of the application that uses FFmpeg to convert the files to another codec and format. It’s very frustrating when the product team advertises speech recognition for m4a as well as mp4 files, but it’s broken. By the way, it was working fine before.
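
A backend conversion step of that kind could be sketched in Node.js as follows. This is only an illustration under assumptions: it requires an `ffmpeg` binary on the PATH, it targets 16 kHz mono WAV rather than the mp3/AAC/ogg targets tested above (WAV needs no optional encoder), and `ffmpegArgs` and `convertToWav` are hypothetical names.

```javascript
// Hypothetical helper: FFmpeg arguments that re-encode any input file to
// 16 kHz mono WAV. -y overwrites the output, -vn drops any video stream.
function ffmpegArgs(inputPath, outputPath) {
  return ['-y', '-i', inputPath, '-vn', '-ac', '1', '-ar', '16000', outputPath];
}

// Hypothetical helper: run the conversion and resolve with the output path.
// Rejects if ffmpeg is missing or exits with a non-zero status.
async function convertToWav(inputPath, outputPath) {
  const { execFile } = await import('node:child_process');
  const { promisify } = await import('node:util');
  await promisify(execFile)('ffmpeg', ffmpegArgs(inputPath, outputPath));
  return outputPath;
}
```

The converted file, not the original Safari blob, would then be sent for transcription.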

keizo · August 23, 2023, 12:49am

I think I had this problem. Calling mediaRecorder.start(1000) fixed it for me.

I posted about it in another long thread, but the forum won’t let me link it here.

EarthLaunch · September 21, 2023, 10:58am

mediaRecorder.start(1000) worked for me on MacOS 13.3.1, Safari 16.4. Thank you so much @keizo.

jonathanmv · November 19, 2023, 6:22pm

[UPDATE: the solution did work for me. The text below is left for reference]

Hi all,

The mediaRecorder.start(1000) solution didn’t work for me. Not on MacOS Safari and not on Chrome on iOS.

I ended up using this 8-year-old Recorderjs* library. It returns the blob in wav format, which Whisper handles well.

Because I can’t include links, please search for mattdiamond/Recorderjs on GitHub.

jonathanmv · November 23, 2023, 9:42am

My bad. The mediaRecorder.start(1000) solution did work. The problem was on the app side, not on the model.

tomkat_cr · December 30, 2023, 1:51am

It worked for me too. I tried other solutions (e.g. using mic-recorder-to-mp3 as described in community[dot]openai[dot]com/whisper-api-only-transcribing-first-few-seconds/457663/7), but mediaRecorder.start(1000) and a little prompt to Whisper made my day.

michael33 · January 26, 2024, 3:37pm

Implementing the solution: 1 Second

Finding the right words to Google:

ssavanovic28 · February 8, 2024, 12:07am

Yes, this is like the 30th article I have read that was related to my question, and I found the article ID from a different article that I barely found.

us74402 · April 7, 2024, 6:08am

Hi everyone,

Does anyone have an idea on how to record audio in iOS iPhone PWA standalone mode? I’m facing an issue where my PWA works well on Safari, but when I add it to the home screen in standalone mode, it asks for permission to access audio. I grant permission, but the audio is not recorded; it just produces a beep sound. Does anyone know how to fix this?

Thanks,
Usama

romain · April 27, 2024, 3:09pm

That worked for me as well. Thank you so much. I spent an hour trying other solutions before I found this one, which was an immediate fix. Cheers!

paul27000 · August 27, 2024, 3:21pm

But this will record only one second, or am I wrong?

keizo · August 27, 2024, 8:14pm

No, I think it has to do with how the audio is chunked.

alvinvogelzang · November 28, 2024, 2:30pm

@keizo is right. This is the number of milliseconds to record into each Blob. By default, the total recording is placed in one large Blob. Setting it to 1000 chunks the recording into Blobs of 1000 ms each.
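
To make the chunking concrete, here is a minimal sketch of recording with a timeslice and joining the chunks back into one Blob. `collectChunks` and `recordFor` are hypothetical names chosen for this example; `MediaRecorder`, `start(timeslice)`, the `dataavailable` and `stop` events, and `getUserMedia` are the standard browser APIs.

```javascript
// Hypothetical helper: start a recorder with a timeslice and gather its chunks.
// Returns a function that joins everything received so far into one Blob.
function collectChunks(recorder, timesliceMs = 1000) {
  const chunks = [];
  recorder.ondataavailable = (event) => {
    // Skip empty chunks, which some browsers emit.
    if (event.data && event.data.size > 0) chunks.push(event.data);
  };
  // With a timeslice, a dataavailable event fires roughly every timesliceMs.
  recorder.start(timesliceMs);
  return () => new Blob(chunks, { type: recorder.mimeType });
}

// Browser-only usage: record for durationMs, then resolve with the full Blob.
async function recordFor(durationMs) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const buildBlob = collectChunks(recorder);
  return new Promise((resolve) => {
    // The last dataavailable event is delivered before the stop event.
    recorder.onstop = () => {
      stream.getTracks().forEach((track) => track.stop()); // release the mic
      resolve(buildBlob());
    };
    setTimeout(() => recorder.stop(), durationMs);
  });
}
```

So the recording is not limited to one second: it runs until `stop()` is called, and the 1000 only controls how often a chunk is handed over.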
