I am using Whisper to transcribe audio recordings. Sometimes everything works well and the transcription proceeds quickly, but at other times the same audio file won’t start transcribing, or transcribes very slowly, until I restart the script, the command prompt, or (most often) the computer.

When I run the transcription command, I get no error at all. It seems to start running, but no transcription appears (or it appears very slowly, about 20 seconds of transcription in 5 minutes).

When I restart the computer, the transcription usually works well again.

I checked the GPU performance while running the transcription command. In both cases (when the transcription works and when it doesn’t), dedicated memory usage on GPU1 is 7.8 of 8 GB while utilization is 0%.
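
A rough way to record the same numbers Task Manager shows, so that good and bad runs can be compared side by side, is to query `nvidia-smi` from Python. This is only a sketch: it assumes `nvidia-smi` is on the PATH (it ships with the NVIDIA driver), and the helper names `parse_gpu_stats` and `query_gpu_stats` are made up for illustration.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi: memory in MiB, utilization in percent.
QUERY = "memory.used,memory.total,utilization.gpu"


def parse_gpu_stats(line):
    """Parse one CSV line such as '7987, 8188, 0' into a dict."""
    used, total, util = (int(field.strip()) for field in line.split(","))
    return {"used_mib": used, "total_mib": total, "util_percent": util}


def query_gpu_stats():
    """Return one dict per NVIDIA GPU, or [] if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return [parse_gpu_stats(line) for line in result.stdout.splitlines() if line.strip()]
```

Calling `query_gpu_stats()` every few seconds while Whisper runs should show whether utilization ever rises above 0% and how close memory usage stays to the 8 GB limit.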

When the transcription works well, the fans run quite loudly, but when it doesn’t, they don’t.

Here are the details:

  • GPU1: NVIDIA GeForce RTX 4060
  • GPU0: Intel(R) Iris(R) Xe Graphics
  • OS: Windows 11 Home
  • Processor: 12th Gen Intel(R) Core™ i5-12500H
  • Installed: Python 3.9.9, ffmpeg, and associated dependencies

Has anyone had a similar issue?

P.S. I’m quite new to all this, so maybe I’m doing something fundamentally wrong. Just in case, here is an example of how I usually start:

python -m venv C:\Users\rusia\venv



whisper xxx.mp3 --model large-v2 --language xxx
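
Turning off “CUDA Sysmem Fallback” solved the issue. This is the “CUDA - Sysmem Fallback Policy” option under Manage 3D Settings in the NVIDIA Control Panel, set to “Prefer No Sysmem Fallback”. A likely explanation: large-v2 nearly fills the 8 GB of VRAM, and with fallback enabled the driver can spill into much slower system RAM instead of raising an out-of-memory error.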
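
To compare the same file before and after changing the setting, the command can be timed from Python. This is a minimal sketch, not part of Whisper itself: `build_whisper_cmd` and `timed_run` are hypothetical helpers, and passing `--device cuda` explicitly is an assumption meant to make a missing GPU fail loudly rather than go unnoticed.

```python
import shutil
import subprocess
import time


def build_whisper_cmd(audio_path, model="large-v2", language=None, device="cuda"):
    """Build the argument list for the whisper CLI (illustrative helper)."""
    cmd = ["whisper", audio_path, "--model", model, "--device", device]
    if language:
        cmd += ["--language", language]
    return cmd


def timed_run(cmd):
    """Run cmd and return elapsed seconds, or None if the executable is not found."""
    if shutil.which(cmd[0]) is None:
        return None
    start = time.perf_counter()
    subprocess.run(cmd, check=True)  # raises CalledProcessError on a non-zero exit
    return time.perf_counter() - start
```

For example, `timed_run(build_whisper_cmd("xxx.mp3", language="xxx"))`, with the placeholders replaced by a real file and language, returns the wall-clock time of one full transcription, which makes a slow run easy to spot.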

