Self-hosting OpenAI Whisper

Hi,

I wanted to ask if I can self-host OpenAI Whisper on a virtual machine, say AWS's EC2. Is there any documentation on that? What steps should I follow?

Thanks

Here is a topic that might help get you started on this:

Also, have a look at the setup instructions on the GitHub page:
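
If it helps, the install-and-run flow from that README boils down to something like this minimal sketch (the model name and audio file are placeholders; ffmpeg must be installed and on PATH):

```python
# pip install -U openai-whisper   (ffmpeg must also be installed)
import whisper

model = whisper.load_model("base")        # weights download on first run
result = model.transcribe("audio.mp3")    # any ffmpeg-readable audio file
print(result["text"])
```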


I don't have GPUs, only an EC2 machine with reasonable specs for self-hosting. Can we go with the containerization approach? I didn't quite follow your message above.

The Python code solution I offered in the linked topic uses an English-only distillation of "turbo" and can run in 2 GB of GPU memory or 4 GB of system RAM, using the native Whisper engine (which I offered because the OP in that topic wanted to extract embeddings).

Running it CPU-only on a 3.6 GHz Xeon, a 60-second file took 44 seconds, including initial model loading.
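
If you want something to experiment with yourself, here's a rough sketch of the same idea. It is not the linked topic's exact code: it runs an English distil-whisper checkpoint (distil-large-v3 as a stand-in) through Hugging Face transformers rather than the native Whisper engine, and times the run with model loading included:

```python
# Illustrative only; assumes: pip install torch transformers, plus ffmpeg for decoding.
import time

import torch
from transformers import pipeline

start = time.perf_counter()
device = "cuda:0" if torch.cuda.is_available() else "cpu"
asr = pipeline(
    "automatic-speech-recognition",
    model="distil-whisper/distil-large-v3",  # English-only distilled checkpoint
    torch_dtype=torch.float16 if device != "cpu" else torch.float32,
    device=device,
)
result = asr("sample_60s.mp3", chunk_length_s=30)  # hypothetical local file
print(result["text"])
print(f"total (load + transcribe): {time.perf_counter() - start:.1f}s")
```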

"EC2" means little; there are hundreds of instance types, and you could even be on a slow ARM processor. Then, if you want to keep an API loaded and active, the hourly bill piles up.

Example:

Paying for Whisper on the API seems much more reasonable, unless you are building your own transcription service (in which case you find a colo that doesn't bill for power, and jam a GPU into a 1U server).
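
To make that concrete, here's a back-of-the-envelope comparison. Both rates are assumptions for illustration (roughly the whisper-1 per-minute price and a g4dn.xlarge on-demand rate); check current pricing for your region:

```python
# Hypothetical rates for illustration only; verify current pricing.
API_PER_AUDIO_MIN = 0.006   # assumed OpenAI whisper-1 price, USD per audio minute
EC2_PER_HOUR = 0.526        # assumed g4dn.xlarge on-demand rate, USD per hour

HOURS_PER_MONTH = 730
ec2_monthly = EC2_PER_HOUR * HOURS_PER_MONTH
break_even_minutes = ec2_monthly / API_PER_AUDIO_MIN

print(f"Always-on EC2: ~${ec2_monthly:.0f}/month")
print(f"Break-even vs API: ~{break_even_minutes:,.0f} audio minutes/month")
```

With those assumed rates, an always-on instance only pays for itself past roughly a thousand hours of audio per month; below that, the per-minute API is cheaper.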