Understanding `gpt-4o-mini-tts` Pricing (Input Characters → Cost)

❓ How exactly is pricing calculated for gpt-4o-mini-tts?

I’m trying to estimate how much I’m going to pay given a specific number of input characters, but the pricing details seem a bit unclear.

From the docs:

  • Input: $0.60 / 1M characters
  • Output: $12.00 / 1M audio tokens
  • Estimated cost: $0.015 / minute (but this is clearly just an estimate)

My questions:

  1. What counts as input characters? Is it the sum of instructions + input text?
  2. If output pricing is based on audio tokens, how does character count relate to audio tokens (given that the $0.015 / minute figure is an estimate, not the actual price)?

My goal is to be able to plug in a number of input characters and estimate the real cost before I scale up usage.
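
To make that concrete, here is a rough sketch of the kind of estimator I have in mind. The three prices are the ones quoted from the docs above; everything else is my own assumption. In particular, `ASSUMED_CHARS_PER_MINUTE` is a guessed speaking rate (not a documented figure), the tokens-per-minute value is only what the docs' own $0.015 / minute estimate implies, and `count_instructions` is a hypothetical switch that exists precisely because I don't know the answer to question 1.

```python
# Prices quoted from the docs (USD).
INPUT_PRICE_PER_CHAR = 0.60 / 1_000_000
OUTPUT_PRICE_PER_AUDIO_TOKEN = 12.00 / 1_000_000
ESTIMATED_PRICE_PER_MINUTE = 0.015

# Implied by the docs' own per-minute estimate (roughly 1250 tokens per minute).
# This is derived arithmetic, not an official conversion rate.
IMPLIED_TOKENS_PER_MINUTE = ESTIMATED_PRICE_PER_MINUTE / OUTPUT_PRICE_PER_AUDIO_TOKEN

# ASSUMPTION: characters of input text spoken per minute of audio.
# This is a guess; measure it on your own text and voice settings.
ASSUMED_CHARS_PER_MINUTE = 900.0


def estimate_cost(input_chars, instruction_chars=0, count_instructions=True,
                  chars_per_minute=ASSUMED_CHARS_PER_MINUTE):
    """Return a rough cost breakdown in USD for one TTS request."""
    # Whether instructions are billed as input is the open question (1),
    # so it is left as a switch rather than hard-coded.
    billed_chars = input_chars + (instruction_chars if count_instructions else 0)
    input_cost = billed_chars * INPUT_PRICE_PER_CHAR

    # Only the spoken text contributes to audio length under this model.
    minutes = input_chars / chars_per_minute
    audio_tokens = minutes * IMPLIED_TOKENS_PER_MINUTE
    output_cost = audio_tokens * OUTPUT_PRICE_PER_AUDIO_TOKEN

    return {
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total_cost": input_cost + output_cost,
        "estimated_minutes": minutes,
        "estimated_audio_tokens": audio_tokens,
    }


if __name__ == "__main__":
    print(estimate_cost(input_chars=900_000, instruction_chars=200))
```

If the relationship between characters and audio tokens is documented somewhere, I would replace the two assumed constants with the real numbers; as written, the output side is only as good as the guessed speaking rate.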

Any clarification would be super appreciated!