Whisper on Raspberry Pi 4 gives Segmentation fault

Maybe it is a torch bug in Whisper on the Raspberry Pi 4.

I have tried Whisper on an M1 MacBook Pro, a VPS, and a Raspberry Pi 4.

On the MacBook and the VPS, Whisper works fine.

But on the Raspberry Pi 4, it does not.

The following is the HW / SW spec of the VPS machine:

  • 8G RAM, 4 vCPU
  • Debian GNU/Linux 12 (bookworm) , Python 3.11.2

The following is the HW / SW spec of the Raspberry Pi 4 machine:

  • 8G RAM, 4 vCPU
  • Debian GNU/Linux 12 (bookworm) , Python 3.11.2
    (same as the VPS)

The VPS and the Raspberry Pi 4 have the same specs, but Whisper fails only on the Raspberry Pi 4.
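
To rule out a hidden environment difference, a quick check like this (just standard platform info plus the installed torch version; assuming torch imports cleanly on both machines) can be run on the VPS and on the Raspberry Pi 4 and compared:

import platform
import torch

# Basic environment info, for comparing the VPS and the Raspberry Pi 4
print("python  :", platform.python_version())
print("machine :", platform.machine())   # architecture string, e.g. x86_64 or aarch64
print("torch   :", torch.__version__)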

My code is this:

import whisper

model = whisper.load_model("base")
result = model.transcribe("video.mp4")

The ‘transcribe’ method always gives a ‘Segmentation fault’ on the Raspberry Pi 4.
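
Not a fix, but to narrow down where it dies, the standard-library faulthandler module can print the Python-level traceback at the moment of the segfault. A minimal sketch of the same call with faulthandler enabled:

import faulthandler
import whisper

faulthandler.enable()  # print the Python traceback if a fatal signal (e.g. SIGSEGV) is received

model = whisper.load_model("base")
result = model.transcribe("video.mp4")
print(result["text"])

That at least shows which Python call the crash happens inside (language detection, the mel spectrogram, the decoder, and so on).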

I guess it is a bug in torch, but I am not sure.

I have tried to find the cause, but could not.

Has anybody tried Whisper on a Raspberry Pi 4?

I’d also be interested in learning how to debug this. I am running into the same issue, except I am running in an Ubuntu Docker container on my M1 MacBook Air.

On the Mac (host OS), no issue:

>> whisper hello_world.wav 
/opt/homebrew/Cellar/openai-whisper/20231106/libexec/lib/python3.11/site-packages/whisper/transcribe.py:115: UserWarning: FP16 is not supported on CPU; using FP32 instead
  warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Detecting language using up to the first 30 seconds. Use `--language` to specify the language
Detected language: English
[00:00.000 --> 00:00.800]  Hello world.

In the Ubuntu 22.04 Docker container, the same command crashes:

# whisper hello_world.wav 
/usr/local/lib/python3.10/dist-packages/whisper/transcribe.py:115: UserWarning: FP16 is not supported on CPU; using FP32 instead
  warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Detecting language using up to the first 30 seconds. Use `--language` to specify the language
Segmentation fault

Is there a way to get more granular info on what’s going on?
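
Two generic options for getting more detail on a native crash like this (assuming gdb can be installed in the container) are enabling Python's faulthandler, or running the command under gdb to get a native backtrace:

# Python-level traceback at the moment of the crash
PYTHONFAULTHANDLER=1 whisper hello_world.wav

# Native backtrace; the whisper entry point is a plain Python script,
# so it can be passed to python3 under gdb
gdb -ex run -ex bt --args python3 $(which whisper) hello_world.wav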