I need to run Whisper on Linux, on my own GPU. I’ve found it quite useful on my Debian laptop, but without a GPU it just kills the battery. I have a 4U server in colocation with some empty PCIe slots running CentOS 8 (yuck). I’d be happy to reload that with Debian if necessary, but I need to keep it GNU+Linux.
What GPU would you recommend? I don’t think I want the latest and greatest; I’d rather get something a generation or two old, used, to save money and reduce the e-waste in this world. My CPU is a Nehalem, if that matters. (I saw a warning about ROCm not supporting CPUs prior to Haswell.)
Basic rule of thumb: more VRAM, more better. A 2080 would be a powerhouse with plenty of CUDA cores to take advantage of, although even a 2060 or 3060 will typically have enough VRAM for Whisper and be tens to hundreds of times faster than a CPU solution.
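If you want to sanity-check whatever card you end up with, here’s a minimal sketch (assuming the openai-whisper package and a CUDA build of PyTorch; sample.wav is a stand-in for your own audio) that loads a model onto the GPU and prints how much VRAM it actually takes:

    import torch
    import whisper  # pip install openai-whisper

    assert torch.cuda.is_available(), "no CUDA device visible"

    # OpenAI quotes roughly 5 GB of VRAM for "medium", ~10 GB for "large"
    model = whisper.load_model("medium", device="cuda")
    used = torch.cuda.memory_allocated() / 2**30
    print(f"model loaded, ~{used:.1f} GiB of VRAM allocated")

    result = model.transcribe("sample.wav")  # hypothetical input file
    print(result["text"])

Going by OpenAI’s published VRAM figures, a 6GB 2060 can just fit medium, while the 12GB on a 3060 gives you headroom for large.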
“My CPU is a Nehalem” = 1st-gen Core, vs. 13th gen now. That’ll be your limiting factor.
I have a 3.6GHz Nehalem->Westmere Xeon workstation with 48GB of RAM. Spec’d at what would have been over $10,000 (or simply impossible) when new, it kept up with newer CPUs for a long time, but for CPU inference it’s now laughable compared to the cheapest $120 Intel offering.
Fortunately, Whisper dumps out words at a massive rate compared to language models.
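If you want to put a number on that, a quick real-time-factor check (again just a sketch; the model size and sample.wav are placeholders):

    import time
    import whisper

    model = whisper.load_model("small", device="cuda")

    audio = whisper.load_audio("sample.wav")  # hypothetical input file
    duration = len(audio) / whisper.audio.SAMPLE_RATE  # 16 kHz mono

    start = time.time()
    model.transcribe(audio)
    elapsed = time.time() - start

    # real-time factor: seconds of speech transcribed per second of compute
    print(f"{duration:.0f}s of audio in {elapsed:.1f}s "
          f"({duration / elapsed:.1f}x real time)")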
Maybe the most promising investment is the 4060 Ti 16GB that should be out soon: slow, but cheaper, and with the memory to support larger models for the patient.
A used Tesla P40 24GB is an option for local models, but I can find no benchmark or proof-of-concept of someone running Whisper on one.
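If anyone tries it, the caveat I’d expect is that the P40 is a Pascal part with badly crippled FP16 throughput, so you’d probably want to force FP32, which openai-whisper exposes directly (a sketch, not a tested P40 config):

    import whisper

    model = whisper.load_model("large", device="cuda")

    # Pascal cards like the P40 run FP16 at a small fraction of their
    # FP32 rate, so disable Whisper's default fp16 inference here.
    result = model.transcribe("sample.wav", fp16=False)  # hypothetical file
    print(result["text"])

The 24GB should hold even large in full FP32 with room to spare.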