How does the 'Call Annie' app achieve such remarkable speed with the ChatGPT API, and is it using stream mode?


I have been exploring the ‘Call Annie’ app and have noticed that its response speed, reportedly with the ChatGPT API, is remarkably fast. In contrast, when I directly utilize the ChatGPT API, I experience a considerably slower response time. This significant speed difference is intriguing, and I’m keen to understand the underlying techniques or strategies that make this possible.

To clarify, my query is not centered on the “talking head” technology; I’m purely interested in the performance of the AI chatbot. I was wondering if you could provide some insights into how the ‘Call Annie’ app might be optimizing response speed or if there are specific implementation techniques at play. Could it be that ‘Call Annie’ is using the stream mode of the ChatGPT API, which I haven’t tried yet?

I have attempted a variety of methods to optimize the use of the ChatGPT API, including analyzing network data and running performance tests. However, I haven’t been able to determine what might be causing this substantial difference in speed. Any insights or suggestions to shed light on this situation would be greatly appreciated.

Thank you for your time and assistance.


I have the same question. ‘Call Annie’ responds really fast.

I’m not familiar with Call Annie specifically, but we use streaming responses, and the first response naturally arrives significantly faster. It’s really a perceived speed, though: if you receive the first tokens of the response within a second but the whole response takes 10 seconds, it’s going to ‘feel’ a lot faster than waiting 10 seconds for the full response, even though it still takes 10 seconds in total.
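To make the perceived-speed point concrete, here is a minimal sketch that measures time to first token against total time. It does not call the real API: `fake_token_stream` is a hypothetical stand-in for a streamed response (with the ChatGPT API you would request one by setting `stream=True`), and the token list and delay are made-up values.

```python
import time
from typing import Iterator, List, Tuple


def fake_token_stream(tokens: List[str], delay_s: float) -> Iterator[str]:
    """Hypothetical stand-in for a streamed response: one token every delay_s seconds."""
    for token in tokens:
        time.sleep(delay_s)
        yield token


def measure(stream: Iterator[str]) -> Tuple[float, float, str]:
    """Return (seconds until the first token, total seconds, full text)."""
    start = time.perf_counter()
    first_token_at = None
    parts = []
    for token in stream:
        if first_token_at is None:
            first_token_at = time.perf_counter() - start
        parts.append(token)
    total = time.perf_counter() - start
    if first_token_at is None:  # empty stream
        first_token_at = total
    return first_token_at, total, "".join(parts)


tokens = ["Hello", ",", " how", " can", " I", " help", "?"]
ttft, total, text = measure(fake_token_stream(tokens, delay_s=0.01))
print(f"first token after {ttft:.3f}s, full response after {total:.3f}s")
print(text)
```

The total time is the same either way; streaming only changes how soon the user sees (or hears) something.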

Dear chelbling,

Thank you very much for your explanation. It has greatly helped me understand the perception of response speed. About the streaming responses you mentioned, I would like to delve deeper.

I am currently using ChatGPT for voice responses: I send each complete response to TTS (Text-To-Speech) for audio playback. This process is quite slow. I’m wondering whether streaming could be used for generating voice responses. I appreciate your help and look forward to your advice.

Best regards,

As long as whatever library or service you’re using for TTS supports streaming input, you should be able to pipe the streamed response from ChatGPT to TTS. You might get some weirdness with incomplete words, though, so you’ll probably want some kind of intermediate step that chunks by word.
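A rough sketch of that intermediate step, assuming the streamed response arrives as arbitrary text fragments that can cut a word in half. `whole_words` is a hypothetical helper name, and the fragments below are invented for illustration. It buffers the incoming text and only releases a word once whitespace after it has been seen.

```python
import re
from typing import Iterable, Iterator


def whole_words(fragments: Iterable[str]) -> Iterator[str]:
    """Yield only complete words from a stream of arbitrary text fragments."""
    buffer = ""
    for fragment in fragments:
        buffer += fragment
        # Everything before the last whitespace is complete; the tail may
        # still be the start of a word, so it stays in the buffer.
        *complete, buffer = re.split(r"\s+", buffer)
        for word in complete:
            if word:
                yield word
    if buffer:
        # The stream has ended, so whatever is left is the final word.
        yield buffer


# Made-up fragments that split words mid-way, as a streamed response might.
fragments = ["Hel", "lo the", "re, how a", "re you", "?"]
print(list(whole_words(fragments)))
# -> ['Hello', 'there,', 'how', 'are', 'you?']
```

Each yielded word could then be handed to the TTS input. Buffering up to sentence-ending punctuation instead of whitespace is a similar change and may sound more natural, depending on the TTS engine.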

Same question here: does anyone know which TTS Call Annie uses?

Hi! Do you know how Call Annie was made? I made an AI therapist and would love to use their API or technology to animate my character, called Zenon, and allow users to speak to her in the same way. I’ve recently been using Bland AI for AI phone calls, giving my AI assistant a phone number, so live video chat with GPT for computer vision would be iconic.