GPT realtime and S2S (speech to speech models)

Saif_Kharouf · November 26, 2025, 1:16pm

I want to ask a couple of questions about the pricing and usage of the live models (aka speech-to-speech models).
(I am using Pipecat.)
1 - How are we billed for the models across the different modalities (audio to audio) / (audio to text)? Are text and audio tokens billed together or separately, depending on the modality?

2 - I want to better understand the usage the model reports, like this one for audio to text:

total_tokens=3021 input_tokens=2971 output_tokens=50 input_token_details=TokenDetails(cached_tokens=2880, text_tokens=2937, audio_tokens=34, cached_tokens_details=CachedTokensDetails(text_tokens=2880, audio_tokens=0), image_tokens=0) output_token_details=TokenDetails(cached_tokens=0, text_tokens=50, audio_tokens=0, cached_tokens_details=None, image_tokens=0)

And this one for audio to audio:

total_tokens=3618 input_tokens=3182 output_tokens=436 input_token_details=TokenDetails(cached_tokens=2688, text_tokens=2848, audio_tokens=334, cached_tokens_details=CachedTokensDetails(text_tokens=2688, audio_tokens=0), image_tokens=0) output_token_details=TokenDetails(cached_tokens=0, text_tokens=101, audio_tokens=335, cached_tokens_details=None, image_tokens=0)
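To make the question concrete, here is a rough sketch of how I am reading these numbers. It splits each usage report into separate buckets (uncached text in, cached text in, audio in, text out, audio out) and checks that they add back up to `total_tokens`. The `Usage` class, the bucket names, and `estimate_cost` are my own hypothetical helpers, not part of Pipecat or the OpenAI SDK, and the rates are placeholders rather than real prices. It also assumes each bucket is billed at its own per-million-token rate, which is exactly what I am trying to confirm.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Token counts copied by hand from the usage logs above (hypothetical helper)."""
    input_text: int
    input_audio: int
    cached_text: int
    cached_audio: int
    output_text: int
    output_audio: int


def billable_buckets(u: Usage) -> dict[str, int]:
    """Split a usage report into buckets that could each carry their own rate."""
    return {
        "text_in_uncached": u.input_text - u.cached_text,
        "text_in_cached": u.cached_text,
        "audio_in_uncached": u.input_audio - u.cached_audio,
        "audio_in_cached": u.cached_audio,
        "text_out": u.output_text,
        "audio_out": u.output_audio,
    }


def estimate_cost(u: Usage, rates_per_million: dict[str, float]) -> float:
    """Sum tokens * rate over all buckets; rates are per one million tokens."""
    buckets = billable_buckets(u)
    return sum(buckets[name] * rates_per_million[name] for name in buckets) / 1_000_000


# Numbers taken from the two usage logs above.
audio_to_text = Usage(input_text=2937, input_audio=34, cached_text=2880,
                      cached_audio=0, output_text=50, output_audio=0)
audio_to_audio = Usage(input_text=2848, input_audio=334, cached_text=2688,
                       cached_audio=0, output_text=101, output_audio=335)

# PLACEHOLDER rates only (assumption): replace with the values from the pricing page.
PLACEHOLDER_RATES = {name: 1.0 for name in billable_buckets(audio_to_text)}

if __name__ == "__main__":
    for label, usage in [("audio to text", audio_to_text),
                         ("audio to audio", audio_to_audio)]:
        print(label, billable_buckets(usage))
```

If this reading is right, the audio-to-text turn has only 57 uncached text input tokens (2937 - 2880) plus 34 audio input tokens, and the audio-to-audio turn has 160 uncached text input tokens (2848 - 2688), 334 audio input tokens, 101 text output tokens, and 335 audio output tokens.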

Thanks
