Workflow for generating full-text transcript and captions with Whisper

ric-ie · July 28, 2023, 2:06pm

I’m currently testing the waters with whisper to generate full-text transcripts AND captions for videos. My workflow looks like this:

1. Use whisper locally to generate SRT captions (timestamped) and a TXT transcript (no timestamps).
2. Manually make the necessary corrections with a colleague (Google Docs) in both the captions and the transcript.
3. Publish the captioned video (not forced / burned-in subs) + a separate transcript document.

The problem here is that I need to keep the SRT and TXT versions in sync, and this workflow means making every correction twice, once in each file/document.

Does whisper provide a workaround/feature like alignment for this? E.g.:

1. Generate TXT transcript,
2. Make corrections,
3. Generate SRT with corrections and timestamps according to original file?

Prior to whisper I’ve used either Otter.ai or Premiere Pro to generate transcripts, made corrections in a Google Doc (necessary in my use case for version control, use of Grammarly, sharing for review, etc.), and then used the alignment features in Google Drive or YouTube to turn the full-text transcript into captions.

It seems like a Google Apps Script + the whisper API could be used to handle all of this, but I’m curious what other solutions people have for this, or whether there are features in whisper I’ve overlooked.

omeh2003_kaz · July 28, 2023, 6:33pm

If I understood you correctly, this can be handled on the Google Apps Script side. You just need to replace the old text in the SRT file with the new text, using a script like this:

function updateSrtFile() {
  // Corrected text from the Google Doc this script is bound to.
  var doc = DocumentApp.getActiveDocument();
  var body = doc.getBody();
  var text = body.getText();
  // Replace with the Drive file ID of your SRT file.
  var srtFile = DriveApp.getFileById('ID of your SRT file');
  var srtContent = srtFile.getBlob().getDataAsString();
  var lines = srtContent.split('\n');
  for (var i = 0; i < lines.length; i++) {
    // Swap the placeholder for the corrected text wherever it appears.
    if (lines[i].indexOf('text to replace') !== -1) {
      lines[i] = lines[i].replace('text to replace', text);
    }
  }
  srtContent = lines.join('\n');
  srtFile.setContent(srtContent);
}
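
For the "corrected transcript back to timed captions" step from the original question, here is a rough sketch of one possible approach. It is not a whisper feature. It keeps the cue numbers and timestamps from the original SRT and spreads the corrected words over the cues in proportion to each cue's original word count. The helper names (parseSrt, countWords, realignSrt) are made up for illustration, and the approach assumes the corrections are small word-level edits, so cue boundaries are only approximate when a lot of text is added or removed.

```javascript
// Parse SRT text into cues: [{ index, time, text }].
// Assumes well-formed SRT: index line, timestamp line, then text lines.
function parseSrt(srt) {
  return srt
    .replace(/\r\n/g, '\n')
    .trim()
    .split(/\n{2,}/)
    .map(function (block) {
      var lines = block.split('\n');
      return {
        index: lines[0].trim(),
        time: lines[1].trim(),
        text: lines.slice(2).join(' ').trim()
      };
    });
}

// Count whitespace-separated words; an empty string has zero words.
function countWords(s) {
  var t = s.trim();
  return t === '' ? 0 : t.split(/\s+/).length;
}

// Rebuild the SRT with the original timestamps and the corrected words.
// Each cue receives a share of the corrected words proportional to the
// number of words it held in the original captions.
function realignSrt(originalSrt, correctedText) {
  var cues = parseSrt(originalSrt);
  var words = correctedText.trim().split(/\s+/).filter(Boolean);
  var totalOld = cues.reduce(function (n, c) {
    return n + countWords(c.text);
  }, 0) || 1; // avoid dividing by zero if every cue was empty
  var seenOld = 0;
  var start = 0;
  return cues.map(function (cue, i) {
    seenOld += countWords(cue.text);
    // The last cue takes whatever words remain.
    var end = (i === cues.length - 1)
      ? words.length
      : Math.round(words.length * seenOld / totalOld);
    var text = words.slice(start, end).join(' ');
    start = end;
    return cue.index + '\n' + cue.time + '\n' + text;
  }).join('\n\n') + '\n';
}
```

In the script above, this could be fed with srtFile.getBlob().getDataAsString() as the original SRT and body.getText() as the corrected transcript, writing the result back with srtFile.setContent(...). It is worth spot-checking the output against the video, since the word split is a guess and not a real audio alignment.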
