How can I use the new Whisper large-v3 model via the API?

Hello,

I am using open-source Whisper with the large-v3 model. I would like to switch to the OpenAI API, but I found that it only supports v2, and I don’t know the name of the underlying model. https://platform.openai.com/docs/api-reference/audio/createTranscription

For the same audio file, the local large-v3 model works well, but the API cannot transcribe it correctly. How can I specify which model the Whisper API uses? Thanks!


The docs say v2 is the only one available right now.

If you check the rate limits page, it seems to confirm this:

https://platform.openai.com/docs/guides/rate-limits/usage-tiers?context=tier-five

Although naming v2 whisper-1 is a bit confusing! Haha…
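For reference, here is a minimal sketch of calling the API with that model name. It assumes the v1 `openai` Python package and an `OPENAI_API_KEY` environment variable; the helper names `build_request_args` and `transcribe_via_api` are just mine for illustration, not something from the docs.

```python
API_MODEL = "whisper-1"  # the model name the API docs list for transcription


def build_request_args(audio_file, language=None):
    """Collect keyword arguments for client.audio.transcriptions.create()."""
    args = {"model": API_MODEL, "file": audio_file}
    if language is not None:
        # Optional ISO-639-1 code such as "en"; may help on difficult audio.
        args["language"] = language
    return args


def transcribe_via_api(path, language=None):
    """Send one audio file to the transcription endpoint and return its text."""
    # Imported lazily so the helper above works without the package installed.
    from openai import OpenAI  # assumption: `pip install openai` (v1 or later)

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            **build_request_args(audio_file, language)
        )
    return result.text
```

You would call it as `transcribe_via_api("meeting.mp3")`, where the file name is only a placeholder. As far as I can tell, the `model` argument is the only place to pick a model, and `whisper-1` is the only value it accepts for now.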

I must’ve missed the announcement on v3… they’re usually good about announcing things when they become available, though, so I would stay tuned.


Yes, the API only supports v2. But if you download the open-source version from GitHub and run it on your local machine, you can use v3. Replicate also supports v3.
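In case it helps anyone landing here, running v3 locally looks roughly like this. It is a sketch that assumes the open-source `openai-whisper` package (a release recent enough to know the `large-v3` name) and `ffmpeg` on your PATH; `transcribe_locally` is just a name I picked.

```python
def transcribe_locally(path, model_name="large-v3"):
    """Transcribe an audio file with a locally loaded open-source Whisper model."""
    # Imported lazily; assumption: `pip install -U openai-whisper`.
    import whisper

    # Downloads the weights on first use, then loads them from the local cache.
    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return result["text"]
```

Something like `transcribe_locally("meeting.mp3")` (placeholder file name) returns the transcript as a string. The large models are heavy, so expect this to be slow without a GPU.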

Not sure why OpenAI doesn’t provide the large-v3 model in the API.


Yeah, dunno. Sorry. Might be coming… or?

They seem to be going all in on multimodal, so it’s likely in the works…