I'm looking to timestamp every second of the transcript. Is this possible natively? Thanks in advance!
At the moment, timecodes are only available per segment, in the subtitle output files (srt, vtt). If you want word-level alignment and timestamps, you would need to combine Whisper with a separate alignment tool, and since those alignment models are built for each language separately, that complicates the integration a bit.
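
If segment-level timecodes are enough for your use case, here is a rough sketch of attaching them to the transcript text yourself. It assumes each segment is a dict with `start` and `end` (in seconds) and `text` keys, which is the shape I would expect from the `segments` list in the Python `transcribe()` result. The helper names and the sample data are made up for illustration.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as HH:MM:SS,mmm (the SRT timecode style)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_lines(segments):
    """Prefix each segment's text with its start and end timecodes."""
    return [
        f"[{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}] "
        f"{seg['text'].strip()}"
        for seg in segments
    ]


# Hypothetical sample data standing in for real transcription output.
segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello there."},
    {"start": 2.5, "end": 3661.25, "text": " This one ends after an hour."},
]

for line in segments_to_lines(segments):
    print(line)
```

This only gives one timestamp pair per segment, not per second or per word. For anything finer you would still need the separate alignment step mentioned above.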