RAG with Realtime API - samples / guidelines / best practices?

Any samples / guidelines / best practices so far on how to do RAG efficiently with the Realtime API?
Interested in keeping latency low. Cost is an important factor, of course.

1 Like

You’ll want to use tools. I don’t have an example yet, but given the cost of the RT API you want to keep your main voice conversation as small as possible. Tools seem to be the key to bridging over to RAG in a way that’s cost-effective.
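Roughly, the wiring would look something like this. This is a sketch, not a tested example: the model name, tool shape, and the `websockets` usage are my assumptions, so check them against the current Realtime API docs. The point is that the session instructions stay short and the documents never enter the (expensive) realtime context.

```python
import asyncio
import json
import os

import websockets  # pip install websockets

# Assumed model name; substitute whatever realtime model you are using.
REALTIME_URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"

SEARCH_TOOL = {
    "type": "function",
    "name": "search",
    "description": "Look up information in the knowledge base before answering.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

async def main() -> None:
    headers = {
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        "OpenAI-Beta": "realtime=v1",
    }
    # Older websockets releases call this kwarg `extra_headers`.
    async with websockets.connect(REALTIME_URL, additional_headers=headers) as ws:
        # Keep instructions short; RAG content arrives through the tool,
        # not the voice conversation itself.
        await ws.send(json.dumps({
            "type": "session.update",
            "session": {
                "instructions": "Use the search tool to look things up before answering.",
                "tools": [SEARCH_TOOL],
                "tool_choice": "auto",
            },
        }))

asyncio.run(main())
```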

1 Like

Can you mention any specific tool that might be helpful here?
I am stuck on the same problem.

1 Like

You can give it a tool called “search(query)” that does the RAG part. That tool can make another model call and return an answer that the assistant will read back.
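Concretely, the round trip looks something like this. It’s a hedged sketch that assumes a WebSocket session like the one above; `retrieve()` is a placeholder for whatever vector search you use, the second model call is to a cheaper text model so the realtime model only reads a short answer back, and the event/message names are from the Realtime API reference as I remember them, so verify against the current docs.

```python
import json

from openai import AsyncOpenAI  # pip install openai

oai = AsyncOpenAI()

def retrieve(query: str) -> list[str]:
    """Placeholder: return the top matching chunks from your vector store."""
    raise NotImplementedError

async def rag_answer(query: str) -> str:
    # Second, cheaper text-model call: condense the retrieved chunks into a
    # short answer so the expensive realtime model only has to read it back.
    context = "\n".join(retrieve(query))
    resp = await oai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer briefly, using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
        ],
    )
    return resp.choices[0].message.content or ""

async def handle_events(ws) -> None:
    async for raw in ws:
        event = json.loads(raw)
        # Fires once the model has finished emitting the tool-call arguments.
        # (Only one tool is registered, so no need to dispatch on the name.)
        if event["type"] == "response.function_call_arguments.done":
            query = json.loads(event["arguments"])["query"]
            answer = await rag_answer(query)
            # Hand the tool result back, then ask the model to speak it.
            await ws.send(json.dumps({
                "type": "conversation.item.create",
                "item": {
                    "type": "function_call_output",
                    "call_id": event["call_id"],
                    "output": answer,
                },
            }))
            await ws.send(json.dumps({"type": "response.create"}))
```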

2 Likes

Can I request sample code from you?

I open-sourced this: GitHub - adorosario/openai-realtime-with-customgpt-poc: POC Using OpenAI Realtime API with CustomGPT for RAG And Twilio Voice

You can rip out the CustomGPT.ai RAG if you want and replace it with whatever endpoint your own RAG uses.

Latency is going to be an issue for sure (especially if your RAG is not super fast). I had to implement a UX “typing” sound (similar to the “…” GIF in text chatbots - LOL!)
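If you do swap out the CustomGPT call, the replacement can be as simple as something like the sketch below. The endpoint URL and response shape are hypothetical, so adapt them to your service; keep the timeout tight, since the caller is sitting in silence (or listening to that typing sound) while it runs.

```python
# Hypothetical drop-in for the CustomGPT.ai call: POST the query to your own
# RAG endpoint and return its answer string for the assistant to read back.
import httpx  # pip install httpx

RAG_ENDPOINT = "https://your-rag-service.example.com/search"  # placeholder URL

async def rag_search(query: str, timeout: float = 3.0) -> str:
    # A tight timeout keeps the worst-case silence on the call bounded.
    async with httpx.AsyncClient(timeout=timeout) as client:
        resp = await client.post(RAG_ENDPOINT, json={"query": query})
        resp.raise_for_status()
        # Assumes a {"answer": "..."} response body; adapt to your schema.
        return resp.json().get("answer", "")
```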