How do I calculate the usage cost when using the GPT-4o-mini-TTS model?

I’m using the GPT-4o-mini-TTS model, but since the response doesn’t return usage data, I have no way of knowing the number of input or output tokens.

For the moment, the speech API doesn’t return usage information.

But its costs are aproximately: 1k input characters ~= 1 minute ~= $0.015 (for english).

If you need exact usage, an alternative is using the gpt-4o-mini-audio-preview, which returns more detailed usage.

You can add some system prompt to make it behave like a TTS endpoint:

Echo the exact text sent to you in the user prompt, with no extra responses

Try it in the playground for more details.

1 Like

And I forgot to mention, in the usage dashboard if you export csv data you can see the sumarized input tokens, but not for individual requests.

1 Like

I’m not always able to access the usage dashboard, so it’s difficult for me to keep track of detailed usage information. I’ve reviewed the pricing and found the following:

  • GPT-4o Mini TTS:
  • $0.60 per 1M input tokens
  • $12.00 per 1M output tokens, or $0.015 per minute

Given this, I’d like to confirm: how many input tokens am I actually sending?

instructions+input text (you can have a rough idea at tokenizer)

But the text tokens are almost negligible, you pay mostly for the generated audio in length.

I understand now. Thank you for your assistance

1 Like