How to transcribe a two-person interview with the Whisper API?

parakeet · September 8, 2023, 9:36am

I have successfully tested transcribing a video with the Whisper API (through Make, actually).

But it does not delineate respective speakers in the interview.

I tried issuing this prompt with the API request: “This is an interview. There is more than one speaker. Properly delineate interviewer and interviewee. Also use line breaks at appropriate points.” But it does nothing.

I’m exploring moving off Rev, which certainly does distinguish speakers within the video (Speaker 1, Speaker 2, etc.).

I read that Whisper cannot yet distinguish speakers - is this correct?

Foxalabs · September 8, 2023, 9:41am

Correct, the current iteration of Whisper is unable to differentiate between speakers.

david.lord.butler · December 21, 2023, 10:49pm

This is true, but there are open-source tools built on Whisper that can do this. The technique is called speaker diarization, and that is the search term you should use when looking for this functionality.
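For anyone wondering what combining the two looks like in practice, here is a minimal sketch. It assumes you already have timestamped Whisper segments (for example from a `verbose_json` response, which includes `start`, `end` and `text` per segment) and a list of speaker turns from a separate diarization tool. The `Segment` and `Turn` classes, the function names and the "Speaker 1"/"Speaker 2" labels are all hypothetical and only for illustration. Each segment simply gets the speaker whose turn overlaps it the most, so treat this as a starting point rather than a definitive implementation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timestamped chunk of transcript text (hypothetical shape)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """One speaker turn from a diarization tool (hypothetical shape)."""
    start: float
    end: float
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns, unknown="Unknown"):
    """Label each segment with the speaker whose turn overlaps it most."""
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append((best_speaker, seg.text.strip()))
    return labeled


def format_transcript(labeled):
    """Merge consecutive segments by the same speaker into one line."""
    lines = []
    for speaker, text in labeled:
        if lines and lines[-1][0] == speaker:
            lines[-1][1].append(text)
        else:
            lines.append((speaker, [text]))
    return "\n".join(f"{spk}: {' '.join(parts)}" for spk, parts in lines)


if __name__ == "__main__":
    # Made-up sample data, not real API output.
    segments = [
        Segment(0.0, 2.0, " Hello, thanks for joining."),
        Segment(2.0, 4.5, " Happy to be here."),
        Segment(4.5, 6.0, " It's been a while."),
    ]
    turns = [Turn(0.0, 2.1, "Speaker 1"), Turn(2.1, 6.0, "Speaker 2")]
    print(format_transcript(assign_speakers(segments, turns)))
```

Segments that straddle a speaker change will be assigned entirely to one speaker with this approach; word-level timestamps would be needed to split them more precisely.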
