How to reduce latency with GPT & Unity Requests

Unity · June 5, 2024, 9:57am

Hi, I want to develop a real-time AI assistant with Unity. I’m using the models below:

Whisper
GPT-4o
TTS-1

But the response time is too long: 5-20 seconds.

Is there an API for Voice Mode in GPT? Are there any other ways to reduce latency?
Thank you!

AcertingArt · June 5, 2024, 5:08pm

Change these:
Whisper → Deepgram (nova-2 model) using websockets
TTS-1 → ElevenLabs (Eleven Turbo v2 model) using websockets with optimize_streaming_latency = 3

Use OGG for the audio format.
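
As a rough illustration of the two swaps above, here is a small Python sketch that only builds the websocket URLs (the same query strings would apply from a Unity C# client). The endpoint paths and the `model_id` value are my assumptions from memory of each provider's public docs, and the helper names are made up, so please verify everything against the current documentation before relying on it. Authentication and the audio format settings are left out on purpose.

```python
from urllib.parse import urlencode

# Assumption: endpoint paths recalled from the providers' public docs; verify before use.
DEEPGRAM_WS = "wss://api.deepgram.com/v1/listen"
ELEVENLABS_WS = "wss://api.elevenlabs.io/v1/text-to-speech/{voice_id}/stream-input"


def deepgram_url(model: str = "nova-2") -> str:
    """Build the streaming speech-to-text URL (hypothetical helper)."""
    return f"{DEEPGRAM_WS}?{urlencode({'model': model})}"


def elevenlabs_url(voice_id: str,
                   model_id: str = "eleven_turbo_v2",
                   latency: int = 3) -> str:
    """Build the streaming text-to-speech URL (hypothetical helper)."""
    query = urlencode({
        "model_id": model_id,                   # assumed ID for Eleven Turbo v2
        "optimize_streaming_latency": latency,  # the setting suggested above
    })
    return f"{ELEVENLABS_WS.format(voice_id=voice_id)}?{query}"
```

The point of streaming over websockets is that audio chunks can be sent and played as they arrive, instead of waiting for a complete file on each request.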

For GPT-4o, first get a short filler phrase that repeats back the last thing the user asked, using GPT-3.5 for a faster, lower-latency response. Then obtain the full answer from GPT-4o and speak it once it arrives.
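
The filler-phrase idea could be sketched like this in Python with `asyncio`. Both model calls are replaced by hypothetical stub coroutines (`fast_filler` and `full_answer` just sleep and return a string), so this only shows the ordering: start the slow request immediately, speak the quick filler while it runs, then speak the full answer.

```python
import asyncio


async def fast_filler(question: str) -> str:
    # Hypothetical stand-in for a quick GPT-3.5 call.
    await asyncio.sleep(0.01)
    return f"You asked about {question}, let me check."


async def full_answer(question: str) -> str:
    # Hypothetical stand-in for the slower GPT-4o call.
    await asyncio.sleep(0.05)
    return f"Full answer to: {question}"


async def respond(question: str, speak) -> None:
    # Start the slow request first so it runs while the filler is spoken.
    full_task = asyncio.create_task(full_answer(question))
    speak(await fast_filler(question))
    speak(await full_task)


def run_demo(question: str) -> list:
    spoken = []  # collects what would be sent to text-to-speech
    asyncio.run(respond(question, spoken.append))
    return spoken
```

Calling `run_demo("weather")` returns the filler first and the full answer second. The total wait is unchanged, but the user hears something almost immediately, so the perceived latency drops.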
