I’m wondering what the state of the art is for speaker diarization with Whisper, and whether OpenAI has announced any plans for native support. I’ve found some implementations that run locally, but ideally I’d still be able to use the API for speed and convenience.
Google Cloud Speech-to-Text has built-in diarization, but I’d rather keep my tech stack all OpenAI if I can, and I believe Whisper is better regardless.