Hi everyone,
I’m not from the IT field (this is actually my very first project in programming), but I’ve been trying to put together a simple pipeline using Whisper and speaker diarization with some help from AI tools.
The goal is pretty basic: I just want to transcribe telesales calls in Portuguese and separate the speakers automatically. I managed to combine Whisper with pyannote’s diarization model, but I’m hitting some issues:
- Sometimes the diarization only detects one speaker, even though there are clearly two.
- I’m not sure if I’m using the best approach to align diarization segments with Whisper segments.
Since I’m a beginner, I might be missing something obvious. Could anyone help me with these issues?
Thanks a lot for your patience and help!
If anyone needs it, my code is located here: https://github.com/pfcout/whisper_transcription
Paulo,
I would install Codex CLI and let it write this code for you. Your repo is a good start, but this is a sophisticated project. Good luck either way, but I would embrace Codex CLI and let it do the heavy lifting, with you as the tester.
If you want to stick to your current coding methodology, consider whisperx (look on github, not affiliated with me) because it is nicely integrated with whisper.
If you’re staying the course, some more targeted suggestions…
- have you tuned the pyannote VAD parameters? this will affect speaker ID
- you force the speaker count to 2 but there may be more, correct? use min_speakers=2, max_speakers=4 or something like that on the diar_model call
- you have some odd logic in there for when only one speaker is detected. I think you’re trying to recover, but it seems strange to assume the speakers alternate? Probably not the issue now.
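- there are a lot of settings you can tune through a pipeline config. As far as I know, in pyannote.audio 3.x you load the pipeline with from_pretrained and then apply a dict like this with pipeline.instantiate(...), and the accepted keys depend on the pipeline version (min_duration_on in particular may not be accepted), so check them against your installed model: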
    pipeline_config = {
        "segmentation": {
            "min_duration_on": 0.1,   # minimum speech duration
            "min_duration_off": 0.1,  # minimum silence duration
        },
        "clustering": {
            "method": "centroid",
            "min_cluster_size": 15,
            "threshold": 0.7,  # adjust based on your speakers
        },
    }
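On the alignment question, one common approach is to give each Whisper segment the speaker whose diarization turns overlap it the most. Below is a minimal sketch of that idea, not what your repo currently does. It assumes Whisper segments are dicts with "start", "end" and "text" keys (as in result["segments"]) and that the diarization output has been flattened to (start, end, label) tuples. The helper names overlap and assign_speakers are made up for illustration.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(whisper_segments, speaker_turns, unknown="UNKNOWN"):
    """Label each Whisper segment with the speaker that overlaps it the most.

    whisper_segments: list of dicts with "start", "end", "text" (assumed shape).
    speaker_turns: list of (start, end, label) tuples (assumed shape).
    """
    labeled = []
    for seg in whisper_segments:
        totals = {}  # speaker label -> total overlapping seconds
        for t_start, t_end, label in speaker_turns:
            ov = overlap(seg["start"], seg["end"], t_start, t_end)
            if ov > 0:
                totals[label] = totals.get(label, 0.0) + ov
        # Fall back to a placeholder when no diarization turn touches the segment.
        speaker = max(totals, key=totals.get) if totals else unknown
        labeled.append({**seg, "speaker": speaker})
    return labeled


if __name__ == "__main__":
    # Toy data standing in for real Whisper and pyannote output.
    segments = [
        {"start": 0.0, "end": 2.5, "text": "Alô, bom dia."},
        {"start": 2.5, "end": 5.0, "text": "Bom dia, quem fala?"},
    ]
    turns = [(0.0, 2.4, "SPEAKER_00"), (2.4, 5.2, "SPEAKER_01")]
    for seg in assign_speakers(segments, turns):
        print(seg["speaker"], seg["text"])
```

With pyannote, the turns list can be built from the diarization result with something like [(turn.start, turn.end, label) for turn, _, label in diarization.itertracks(yield_label=True)]. Segments that span a speaker change still get a single label here; word-level timestamps (which whisperx provides) would be needed to split them.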