Audio Model Pricing is Unclear

GoldenJoe · March 22, 2025, 10:39pm

According to the documentation, Whisper is $0.006/minute while TTS is $15.00 per 1M characters. It would be nice if there were an estimated cost per minute, like the new 4o-mini audio models have, so that current Whisper users have a better idea of the potential cost savings.

Also, aside from the prompt, what else can contribute toward the input token cost of 4o-transcribe and 4o-mini-transcribe?
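
For reference, the two flat rates quoted above can be turned into a quick estimate. This is only a minimal sketch using the figures from this post ($0.006/minute for Whisper, $15.00 per 1M characters for TTS). The function names are made up for illustration, and it assumes duration is billed proportionally with no rounding, which may not match actual billing.

```python
# Rates as quoted in the post above; check the pricing page for current values.
WHISPER_USD_PER_MINUTE = 0.006
TTS_USD_PER_MILLION_CHARS = 15.00


def whisper_cost(audio_seconds: float) -> float:
    """Estimated transcription cost for a clip, assuming no rounding of duration."""
    return audio_seconds / 60 * WHISPER_USD_PER_MINUTE


def tts_cost(num_chars: int) -> float:
    """Estimated speech-generation cost for a given number of input characters."""
    return num_chars / 1_000_000 * TTS_USD_PER_MILLION_CHARS


if __name__ == "__main__":
    # One hour of audio through Whisper.
    print(f"Whisper, 60 min: ${whisper_cost(3600):.2f}")
    # 10,000 characters of text through TTS.
    print(f"TTS, 10k chars:  ${tts_cost(10_000):.2f}")
```

Neither rate depends on how much is actually said per minute, which is why these two are easy to estimate and the token-based models are not.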

aprendendo.next · March 22, 2025, 10:51pm

I am a bit confused too. It seems to use a concept of audio tokens, which does not map directly to “minutes”. I couldn’t find any further information, but the API output tells you how many tokens were consumed.

What I can say is that, summing it all up, the cost is very low; you can check it on your usage dashboard.

Basically, for TTS you have the prompt for instructions, which follows the usual text token measure, plus the audio tokens for the generated audio.
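
The breakdown just described (text tokens for the prompt plus audio tokens for the generated audio) could be sketched roughly like this. The function name and both rate parameters are hypothetical; the token counts would come from the usage figures in the API output, and the rates in the usage example are placeholders, not real prices.

```python
def token_based_cost(
    text_tokens: int,
    audio_tokens: int,
    usd_per_million_text: float,
    usd_per_million_audio: float,
) -> float:
    """Sum the text-token and audio-token charges for a single request.

    Both rates are supplied by the caller; nothing here is an official price.
    """
    text_part = text_tokens / 1_000_000 * usd_per_million_text
    audio_part = audio_tokens / 1_000_000 * usd_per_million_audio
    return text_part + audio_part


if __name__ == "__main__":
    # Placeholder rates for illustration only; substitute the ones from the pricing page.
    cost = token_based_cost(
        text_tokens=200,       # instructions plus the text to be spoken
        audio_tokens=1_500,    # audio tokens reported for the generated speech
        usd_per_million_text=1.0,
        usd_per_million_audio=10.0,
    )
    print(f"Estimated request cost: ${cost:.6f}")
```

Dividing the result by the length of the generated audio gives a rough per-minute figure that can be compared against the flat-rate models.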

_j · March 22, 2025, 11:35pm

The graphic seems pretty complete.

Whisper-1 is billed strictly per minute. Whether you send fast-talking Micro Machines or droll Prairie Home Companion, you get the same cost.

The last column IS the estimated operating cost of the modality-transforming GPT models under discussion.

You could certainly tack on extra expense if you used a maximal prompt for voice tone, since that text is not spoken.
