Whisper API costs 10x more than hosting a VM?

Whatever the model, the API pricing doesn't correspond to the underlying compute cost.

Hi @anon34024923! I’m curious about the “multiplied by 20” part. Do you mean that you can run 20 parallel inference jobs on a single 4090 for $0.74/h?
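
For reference, here's a quick back-of-envelope version of that math. The 20 parallel streams are exactly the assumption being questioned; the $0.006/min figure is OpenAI's published Whisper API rate:

```python
# Back-of-envelope check of the "multiplied by 20" claim.
# Assumptions: OpenAI Whisper API at $0.006 per audio minute, a rented
# 4090 at $0.74/h, and (hypothetically) 20 parallel real-time streams.

API_PRICE_PER_MIN = 0.006            # $ per minute of audio (OpenAI pricing)
GPU_PRICE_PER_HOUR = 0.74            # $ per hour for the rented 4090
PARALLEL_STREAMS = 20                # assumed concurrent real-time jobs

api_cost_per_audio_hour = API_PRICE_PER_MIN * 60                 # $0.36
gpu_cost_per_audio_hour = GPU_PRICE_PER_HOUR / PARALLEL_STREAMS  # ~$0.037

print(f"API: ${api_cost_per_audio_hour:.3f} per audio hour")
print(f"GPU: ${gpu_cost_per_audio_hour:.3f} per audio hour")
print(f"Ratio: {api_cost_per_audio_hour / gpu_cost_per_audio_hour:.1f}x")
# ~10x in favor of the GPU -- but only if you actually sustain
# 20 parallel streams at near-100% utilization.
```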


Whisper is so light on resources, I just run it locally.

I run a 16-core Threadripper with a cheap RTX 3060, and I can generate amazingly accurate subtitles for an entire movie in a couple of minutes.
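
For anyone wanting to try this, here's a minimal sketch using the open-source openai-whisper package. The file name and model size are placeholders, and the SRT writing is done by hand to keep the example self-contained:

```python
# Minimal local-Whisper subtitle sketch (pip install openai-whisper).
# "movie.mkv" is a placeholder; pick a model size that fits your VRAM.
import whisper

def fmt(t: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(t * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

model = whisper.load_model("medium")   # "small" also fits a 3060's 12 GB
result = model.transcribe("movie.mkv")

# Write the timestamped segments out as a standard .srt file.
with open("movie.srt", "w", encoding="utf-8") as f:
    for i, seg in enumerate(result["segments"], start=1):
        f.write(f"{i}\n{fmt(seg['start'])} --> {fmt(seg['end'])}\n"
                f"{seg['text'].strip()}\n\n")
```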

Did you ever go and try it out?

I just signed up for RunPod and tried both. The ‘serverless’ option, which is the only one that is truly billed by the second, works the way you describe only in theory, I think. You can create requests that have a delayed (and unpredictable) start and then run. It all works fine, but you’re not really in control of these GPUs: you can’t schedule those parallel jobs to happen on demand. You don’t have much control at all. So I think it’s a perfect solution for tasks where elapsed time doesn’t matter much and parallelism even less. For anything else you’d have to get a ‘server’, and those are all billed by the hour, so you pay regardless of use.
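
To illustrate the submit-and-poll pattern I mean, here's a sketch assuming RunPod's standard serverless endpoints; the endpoint ID and input payload are placeholders for whatever your own worker expects:

```python
# Sketch of the RunPod serverless pattern described above: you submit a
# job, it queues for an unpredictable time, and you poll until it finishes.
import os
import time
import requests

API_KEY = os.environ["RUNPOD_API_KEY"]
ENDPOINT_ID = "your-endpoint-id"     # placeholder
BASE = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Submit: returns immediately with a job id; the cold start happens
# on RunPod's clock, not yours.
job = requests.post(
    f"{BASE}/run", headers=HEADERS,
    json={"input": {"audio_url": "https://example.com/a.mp3"}},
).json()

# Poll: the gap between IN_QUEUE and COMPLETED is the part you can't control.
while True:
    status = requests.get(f"{BASE}/status/{job['id']}", headers=HEADERS).json()
    if status["status"] in ("COMPLETED", "FAILED"):
        break
    time.sleep(2)

print(status)
```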

So I would say the by-the-second serverless option is more of a ‘dev mode’ or ‘rarely used’ setup, and the math we looked at does NOT apply, IMO.
Curious what experiences others have running stuff on RunPod.
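
To put numbers on “you pay regardless of use”, here's a rough sketch; the throughput and utilization figures are assumptions, not measurements:

```python
# Why the math breaks for a by-the-hour pod: you pay for the whole hour
# whether or not jobs arrive. Utilization figures below are illustrative.
GPU_PRICE_PER_HOUR = 0.74
AUDIO_HOURS_PER_GPU_HOUR = 20       # assumed throughput at full load
API_PRICE_PER_AUDIO_HOUR = 0.36     # $0.006/min * 60 (OpenAI pricing)

for utilization in (1.0, 0.5, 0.1, 0.02):
    effective = GPU_PRICE_PER_HOUR / (AUDIO_HOURS_PER_GPU_HOUR * utilization)
    cheaper = "GPU" if effective < API_PRICE_PER_AUDIO_HOUR else "API"
    print(f"utilization {utilization:4.0%}: ${effective:.3f}/audio hour -> {cheaper} wins")
# Under these assumptions the crossover sits around 10% utilization:
# below that, the hourly pod is already pricier than the API.
```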
