ElevenLabs seems to be much faster than OpenAI in text-to-speech (TTS)

cnd · March 31, 2025, 11:49am

Latency means delay.
“Average generation time…” is completely irrelevant.
Nobody cares how long it takes to convert the whole text to audio. The only thing that matters is how long it takes between sending the first word and getting back the start of the audio. You know, the latency …
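To make the distinction concrete, here is a rough sketch of how you could time both numbers yourself. It assumes nothing about any vendor's API: `measure_stream` and `fake_tts_stream` are made-up names, and the fake stream just sleeps to stand in for a real streaming TTS response. Swap in whatever function opens your actual audio stream.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def measure_stream(open_stream: Callable[[], Iterable[bytes]]) -> Tuple[float, float]:
    """Return (time_to_first_audio, total_generation_time) in seconds.

    open_stream is any zero-argument callable that sends the request and
    returns an iterable of audio byte chunks (hypothetical interface).
    """
    start = time.perf_counter()
    first_audio = None
    for chunk in open_stream():
        # Latency is the delay until the first non-empty audio chunk arrives.
        if first_audio is None and chunk:
            first_audio = time.perf_counter() - start
    total = time.perf_counter() - start
    if first_audio is None:
        raise ValueError("stream produced no audio")
    return first_audio, total


def fake_tts_stream(first_delay: float = 0.05,
                    chunk_delay: float = 0.01,
                    chunks: int = 5) -> Iterator[bytes]:
    """Stand-in for a real streaming TTS call (assumption, not a real API)."""
    time.sleep(first_delay)          # simulated wait before audio starts
    for _ in range(chunks):
        yield b"\x00" * 320          # dummy audio frame
        time.sleep(chunk_delay)      # simulated gap between chunks


if __name__ == "__main__":
    latency, total = measure_stream(fake_tts_stream)
    print(f"time to first audio: {latency * 1000:.0f} ms")
    print(f"total generation:    {total * 1000:.0f} ms")
```

The first number is what a listener actually experiences as delay. The second is the “generation time” figure, which can be much larger without anyone noticing as long as audio keeps arriving faster than it plays.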

They advertise “2s to 4s” as target times, which is really weird. Even Google's non-streaming API responds in 400ms or less, up to 10 times faster, and that’s not even their streaming endpoint: you get the audio for the entire sentence back in one go, before you could have said it out loud, 400ms after you sent the text…
