Whisper Segment Start Times

ElevenHeights · April 16, 2024, 7:10am

This is the code I’m using to transcribe audio. For some reason the first segment returned always starts at 0.00, even when speech begins later, while the rest of the segments have correct start times. Is that how it’s supposed to work?

  import openai  # openai>=1.0; the API key is read from OPENAI_API_KEY

  # Transcribe the audio using OpenAI's Whisper API
  with open(audio_path, "rb") as audio_file:
      transcript = openai.audio.transcriptions.create(
          file=audio_file,
          model="whisper-1",
          response_format="verbose_json",  # required for timestamps
          timestamp_granularities=["segment"],
      )

robicobi · May 3, 2024, 2:49pm

I have the exact same issue. This causes subtitles to show before the speaker starts speaking.

Strangely, if you include "word" in your timestamp_granularities (alongside "segment"), the first segment does start at the right time!

It would certainly make more sense if the timestamp were always correct, whichever granularities are requested.
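If you would rather not depend on that side effect, here is a rough sketch of a client-side workaround: request both "word" and "segment" granularities, then move the first segment's start to the first word timestamp that falls inside it. The helper name `align_first_segment` is made up for illustration, and it assumes segments and words are plain dicts with "start" and "end" keys in seconds, which is how the verbose_json payload looks to me. Treat it as a starting point, not a confirmed fix.

```python
from typing import Any


def align_first_segment(
    segments: list[dict[str, Any]],
    words: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Return copies of the segments with the first start aligned to speech.

    Hypothetical helper: assumes each segment and word is a dict with
    "start" and "end" keys (seconds), as in the verbose_json response.
    """
    # Copy so the caller's data is left untouched.
    fixed = [dict(segment) for segment in segments]
    if not fixed or not words:
        return fixed

    first = fixed[0]
    # Start times of the words that fall inside the first segment's window.
    starts = [
        word["start"]
        for word in words
        if first["start"] <= word["start"] < first["end"]
    ]
    if starts:
        # Only ever move the start later, never earlier.
        first["start"] = max(first["start"], min(starts))
    return fixed


# Made-up sample data shaped like the API response, not real output.
sample_segments = [
    {"start": 0.0, "end": 4.2, "text": "Hello there."},
    {"start": 4.2, "end": 7.0, "text": "Welcome back."},
]
sample_words = [
    {"word": "Hello", "start": 1.8, "end": 2.1},
    {"word": "there", "start": 2.1, "end": 2.5},
    {"word": "Welcome", "start": 4.3, "end": 4.8},
    {"word": "back", "start": 4.8, "end": 5.1},
]

aligned = align_first_segment(sample_segments, sample_words)
print([segment["start"] for segment in aligned])
```

To use it with a real response you would pass timestamp_granularities=["word", "segment"] and convert the result to dicts first. I believe transcript.model_dump() does that in recent versions of the openai package, but check against the version you have installed.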
