How does the 'Call Annie' app achieve such remarkable speed with the ChatGPT API, and is it using stream mode?

RayZ · June 5, 2023, 2:28pm

Hello,

I have been exploring the ‘Call Annie’ app and have noticed that its response speed, reportedly with the ChatGPT API, is remarkably fast. In contrast, when I directly utilize the ChatGPT API, I experience a considerably slower response time. This significant speed difference is intriguing, and I’m keen to understand the underlying techniques or strategies that make this possible.

To clarify, my query is not centered on the “talking head” technology; I’m purely interested in the performance of the AI chatbot. I was wondering if you could provide some insights into how the ‘Call Annie’ app might be optimizing response speed or if there are specific implementation techniques at play. Could it be that ‘Call Annie’ is using the stream mode of the ChatGPT API, which I haven’t tried yet?

I have attempted a variety of methods to optimize the use of the ChatGPT API, including analyzing network data and running performance tests. However, I haven’t been able to determine what might be causing this substantial difference in speed. Any insights or suggestions to shed light on this situation would be greatly appreciated.

Thank you for your time and assistance.

xgbgyn · June 8, 2023, 4:48pm

I have the same question. ‘Call Annie’ really does respond fast.

chelbling · June 8, 2023, 8:27pm

I’m not familiar with Call Annie specifically, but we use streaming responses, and it’s significantly faster to the first response, naturally. It’s really a perceived speed, though: if you receive the first tokens of the response within a second but the whole response takes 10 seconds, it’s going to ‘feel’ a lot faster despite actually taking 10 seconds, as opposed to waiting the full 10 seconds for the complete response.
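
To make the perceived-speed point concrete, here is a minimal sketch that times the first chunk against the full reply. It does not call any real API: `fake_token_stream` is a hypothetical stand-in for a streamed response, and the delay and token values are made up for illustration.

```python
import time
from typing import Iterable, Iterator, List, Tuple


def fake_token_stream(tokens: List[str], delay_s: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a streamed reply: one token after each delay."""
    for tok in tokens:
        time.sleep(delay_s)
        yield tok


def measure_stream(chunks: Iterable[str]) -> Tuple[str, float, float]:
    """Return (full_text, seconds_to_first_chunk, seconds_to_full_reply)."""
    start = time.perf_counter()
    first_s = None
    parts = []
    for chunk in chunks:
        if first_s is None:
            # This is the moment a user would first see (or hear) something.
            first_s = time.perf_counter() - start
        parts.append(chunk)
    total_s = time.perf_counter() - start
    return "".join(parts), (first_s if first_s is not None else total_s), total_s


tokens = ["Stream", "ing", " feels", " faster", "."]
text, first_s, total_s = measure_stream(fake_token_stream(tokens))
print(f"first chunk after {first_s:.3f}s, full reply after {total_s:.3f}s: {text!r}")
```

The total time is the same either way; streaming only moves the point at which the user gets the first output.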

RayZ · June 9, 2023, 10:06am

Dear chelbling,

Thank you very much for your explanation. It has greatly helped me understand the perception of response speed. About the streaming responses you mentioned, I would like to delve deeper.

I am currently using ChatGPT for voice responses, where I send all responses to TTS (Text-To-Speech) for audio playback. This process is quite busy. I’m wondering if streaming could be utilized for generating voice responses. I appreciate your help and look forward to your advice.

Best regards,

chelbling · June 12, 2023, 3:08pm

As long as whatever library/service you’re using for TTS supports streaming input, you should be able to pipe the streamed response from ChatGPT to TTS. You might get some weirdness with incomplete words, though, so you’ll probably want an intermediate step that chunks by word.
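
A rough sketch of that intermediate step, assuming the stream arrives as arbitrary text fragments that can split a word in the middle. The name `chunk_by_word` and the sample fragments are made up for illustration; in a real pipeline each yielded word (or a small group of them) would be handed to whatever streaming TTS call your service provides.

```python
from typing import Iterable, Iterator


def chunk_by_word(fragments: Iterable[str]) -> Iterator[str]:
    """Yield only complete words from a stream of arbitrary text fragments."""
    buffer = ""
    for fragment in fragments:
        buffer += fragment
        # Text up to the last whitespace is complete; the tail may be a partial word.
        cut = max(buffer.rfind(" "), buffer.rfind("\n"))
        if cut != -1:
            complete, buffer = buffer[: cut + 1], buffer[cut + 1 :]
            yield from complete.split()
    # Flush whatever is left once the stream ends.
    if buffer.strip():
        yield buffer.strip()


# Fragments deliberately split "Hello" and "there," across chunk boundaries.
words = list(chunk_by_word(["Hel", "lo the", "re, how", " are", " you?"]))
print(words)
```

Depending on the TTS engine, buffering up to a sentence boundary instead of a single word may sound more natural, at the cost of a slightly later first sound.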

weihongqin · July 25, 2023, 6:26am

Same question here: does anyone know which TTS Call Annie uses?

Nikoldigital · January 31, 2024, 2:09am

Hi! Do you know how Call Annie was made? I made an AI therapist and would love to use their API or technology to animate my character, called Zenon, and allow users to speak to her in the same way. I’ve recently been using Bland AI for AI phone calls, giving my AI assistant a phone number. So live video chat with GPT for computer vision would be iconic.

dsa · September 24, 2024, 2:18am

Call Annie uses LiveKit. You can achieve similar speeds quite easily. You can call this number for a demo that uses 4o-mini: +16506804883

anon25271712 · September 24, 2024, 7:31am

Probably the TTS model from OpenAI or ElevenLabs.
