None of the previous solutions mentioned in this thread worked for my environment in Next.js using the App Router. After some trial and error I finally figured it out, and I thought I'd post my solution here for others who are still struggling.
Using the vanilla Safari `MediaRecorder` API worked to record `audio/mp4` blobs, but sending them to the Whisper API always gave me transcripts like "Hello", "Thank You", or "Bye", no matter what the content of the recording was. That's even after @michellep posted that the backend had been updated.
Using `mediaRecorder.start(1000)` didn't work for me either: it would just upload the recorded blob after each second of recording, which aligns with what the Mozilla docs say about the timeslice parameter. It's actually a mystery to me how other people made it work with that setting.
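For reference, here's a minimal sketch (my own illustration, not from any of the solutions above) of how the timeslice parameter is meant to be used according to the MDN docs: each `dataavailable` event only carries a partial chunk, and only the concatenation of all chunks forms a valid file, so uploading each one-second blob on its own can't produce a usable recording:

```tsx
// Sketch: timeslice recording done "correctly" -- collect chunks, combine on stop
async function recordWithTimeslice() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
  const recorder = new MediaRecorder(stream)
  const chunks: Blob[] = []

  recorder.addEventListener("dataavailable", (event) => {
    chunks.push(event.data) // partial chunk -- collect it, don't upload it
  })
  recorder.addEventListener("stop", () => {
    // Only the combined blob is a complete, decodable file
    const completeBlob = new Blob(chunks, { type: recorder.mimeType })
    console.log("recorded", completeBlob.size, "bytes")
  })

  recorder.start(1000) // fire dataavailable roughly every 1000 ms
}
```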
I also tried RecordRTC.js and eventually got it working with `audio/wav`, but I wasn't satisfied with that solution because the blobs are way larger than with `audio/webm` or `audio/mpeg`.
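For completeness, this is roughly what the RecordRTC variant looked like (a sketch, not my exact code; `StereoAudioRecorder` is the recorder type that produces WAV in Safari):

```tsx
import RecordRTC from "recordrtc"

async function recordWav() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
  const recorder = new RecordRTC(stream, {
    type: "audio",
    mimeType: "audio/wav",
    recorderType: RecordRTC.StereoAudioRecorder, // WAV-capable recorder, works in Safari
  })

  recorder.startRecording()
  // ...later, e.g. on a button click:
  recorder.stopRecording(() => {
    const wavBlob = recorder.getBlob() // uncompressed PCM, hence the large blobs
    console.log("recorded", wavBlob.size, "bytes")
  })
}
```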
Solution that worked for me
Other solutions above mentioned `audio-recorder-polyfill`, which is hard to use in a Next.js App Router environment because pages are server-side rendered by default. Even `"use client"` wouldn't do the trick as it usually does. But I finally found what I had to do to make it work:
In the parent component that needs the recording button, I import the `RecordingButton` component dynamically, like so:
```tsx
// MyComponent.tsx
"use client" // required so that { ssr: false } is allowed in the App Router

import dynamic from "next/dynamic"
import React from "react"

// Hoisted out of the component so the dynamic wrapper isn't recreated
// (and RecordingButton remounted) on every render
const RecordingButton = dynamic(() => import("./RecordingButton"), { ssr: false })

export default function MyComponent() {
  return (
    <div>
      {/* other stuff */}
      <RecordingButton />
    </div>
  )
}
```
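The `{ ssr: false }` option is the crucial part: it guarantees that `RecordingButton`, and with it the polyfill check below, is only ever evaluated in the browser, never during server rendering.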
And then inside the `RecordingButton` component I only import the polyfill if `audio/webm` isn't supported by the browser:
```tsx
// RecordingButton.tsx (module scope, so the check runs once on the client)
const supportsWebm =
  typeof MediaRecorder !== "undefined" && MediaRecorder.isTypeSupported("audio/webm")

if (!supportsWebm) {
  // Dynamically import the polyfill if 'audio/webm' is not supported
  Promise.all([
    import("audio-recorder-polyfill"),
    import("audio-recorder-polyfill/mpeg-encoder"),
  ])
    .then(([AudioRecorderModule, mpegEncoderModule]) => {
      const AudioRecorder = AudioRecorderModule.default
      const mpegEncoder = mpegEncoderModule.default
      AudioRecorder.encoder = mpegEncoder
      AudioRecorder.prototype.mimeType = "audio/mpeg"
      // Replace the native implementation so the rest of the app can
      // keep using MediaRecorder as usual
      window.MediaRecorder = AudioRecorder
    })
    .catch((error) => {
      console.error("Error importing polyfill:", error)
    })
}
```
After that I was able to just use the regular browser MediaStream Recording API (you can just ask ChatGPT how to use that from here on).
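For completeness, here's a minimal sketch of that last step (my own illustration, not verbatim from my app: the env var name is made up, and in production you'd proxy the request through an API route so the key stays server-side):

```tsx
async function recordAndTranscribe() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
  const recorder = new MediaRecorder(stream) // native or polyfilled, same API
  const chunks: Blob[] = []

  recorder.addEventListener("dataavailable", (e) => chunks.push(e.data))
  recorder.addEventListener("stop", async () => {
    const blob = new Blob(chunks, { type: recorder.mimeType })
    // Whisper infers the format from the filename, so match it to the mime type
    const filename = recorder.mimeType.includes("webm") ? "audio.webm" : "audio.mp3"

    const formData = new FormData()
    formData.append("file", blob, filename)
    formData.append("model", "whisper-1")

    const res = await fetch("https://api.openai.com/v1/audio/transcriptions", {
      method: "POST",
      // NEXT_PUBLIC_OPENAI_API_KEY is a hypothetical env var for this demo
      headers: { Authorization: `Bearer ${process.env.NEXT_PUBLIC_OPENAI_API_KEY}` },
      body: formData,
    })
    const { text } = await res.json()
    console.log(text)
  })

  recorder.start()
  setTimeout(() => recorder.stop(), 5000) // stop after 5 s for the demo
}
```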
I like this solution best because I still get to use compressed formats instead of `.wav`, and because I can keep using the regular MediaStream API.
PS: Unfortunately I had to strip out all links to the docs and libraries. It would be nice if links were enabled; that would make for higher-quality posts.