Objectively tracking price/token usage in v1/audio/speech and v1/audio/transcriptions?

I need to benchmark the cost comparison between OpenAI’s gpt-4o-realtime-preview and a chained solution (e.g., gpt-4o transcription → gpt-4o-mini TTS) for a specific application. However, I’m struggling to get objective, request-level usage data from OpenAI’s audio endpoints.

What works well

  • Chat completions: The usage field in v1/chat/completions responses provides exact token counts
  • Realtime API: Usage data is available in the response.done event
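For comparison, this is the shape of the `usage` object that `v1/chat/completions` returns (the response dict is truncated to the relevant field, and the parsing helper is mine, not part of the SDK):

```python
# Truncated v1/chat/completions response showing the `usage` field.
response = {
    "id": "chatcmpl-...",
    "usage": {
        "prompt_tokens": 42,
        "completion_tokens": 128,
        "total_tokens": 170,
    },
}

def token_usage(resp: dict) -> tuple[int, int]:
    """Return (input, output) token counts from a chat completions response."""
    u = resp["usage"]
    return u["prompt_tokens"], u["completion_tokens"]

print(token_usage(response))  # → (42, 128)
```

Nothing comparable appears in the audio endpoint responses, which is the gap described below.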

The problem

The audio endpoints don’t provide any granular usage reporting:

  • v1/audio/transcriptions: No usage field indicating how many input tokens gpt-4o-transcribe and gpt-4o-mini-transcribe received.
  • v1/audio/speech: No usage field showing how many audio tokens were generated.

This makes it impossible to track costs at the individual request level, which I need for accurate benchmarking.

Constraints

  • I don’t have access to my organization’s usage dashboard
  • Even if I did, dashboard data doesn’t provide request-level granularity needed for this comparison

Is there something I’m missing in the API responses, or is this a known limitation that will be corrected?

There is no direct usage info in the responses, but you can roughly estimate costs from the audio input length for STT and the generated audio length for TTS.

Instructions and text input tokens can be counted with a tokenizer (e.g., tiktoken).
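A minimal sketch of that duration-based approach (the per-minute rates and helper names here are illustrative placeholders, not current OpenAI pricing — substitute the rates from the pricing page):

```python
# Rough per-request cost estimate from audio duration.
# PRICE_PER_MINUTE values are assumed placeholders, NOT actual
# OpenAI pricing -- replace them with the published rates.
PRICE_PER_MINUTE = {
    "stt": 0.006,   # assumed $/minute of input audio (transcription)
    "tts": 0.015,   # assumed $/minute of generated audio (speech)
}

def estimate_audio_cost(kind: str, duration_seconds: float) -> float:
    """Estimate cost in USD for one STT or TTS request from its audio length."""
    return PRICE_PER_MINUTE[kind] * duration_seconds / 60.0

# e.g. a 90-second clip transcribed, then a 30-second reply synthesized:
total = estimate_audio_cost("stt", 90) + estimate_audio_cost("tts", 30)
print(round(total, 4))  # → 0.0165
```

The input duration is known from your own audio file (e.g. via the `wave` module for WAV files), and the output duration can be measured from the file the TTS endpoint returns, so both sides of the estimate are available locally per request.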

There are a few more details in this topic: