Use knowledge base with realtime api?

What’s the best way to use my own knowledge base with the realtime API?

Would it be through function calling? RAG? How would it work?

Thanks!


That’s an interesting question. Here’s how I would approach it, especially since every AI turn in a realtime session can be increasingly expensive.

Place RAG results retrieved from the user’s input into a section of the system message — that is, the `instructions` field sent via a `session.update` event (`session.update` → `session` → `instructions`). This can only work if you are using text input, or some very fast transcription of the user’s voice running outside of the realtime API.
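A minimal sketch of that injection step, assuming the `session.update` client-event shape from the Realtime API docs (the base instructions, heading text, and retrieval chunks here are illustrative):

```python
import json

BASE_INSTRUCTIONS = "You are a helpful voice assistant."  # hypothetical base prompt

def build_session_update(rag_chunks: list[str]) -> str:
    """Build a session.update event whose instructions embed RAG results.

    Assumption: the event shape {"type": "session.update", "session":
    {"instructions": ...}} matches the Realtime API client events.
    """
    knowledge = "\n".join(f"- {chunk}" for chunk in rag_chunks)
    event = {
        "type": "session.update",
        "session": {
            "instructions": (
                BASE_INSTRUCTIONS
                + "\n\n## Knowledge relevant to the user's next message\n"
                + knowledge
            ),
        },
    }
    return json.dumps(event)

# Before each user turn: run retrieval on the transcribed input,
# then send this event over the websocket before triggering a response.
payload = build_session_update(["Returns are accepted within 30 days."])
```

The key point is that the retrieval text lives in the session instructions rather than in a function-call round trip, so the turn is still a single model invocation.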

The benefit is that you haven’t automatically doubled the cost of the turn, and you avoid the added delay of having the AI write a function call, sending the return value back into the server-side chat history, and then running the whole input context again.

The voice conversation then appears continuous after the system message, so the AI should never drop out of voice.

If OpenAI ever implements context caching for significant discounts, such early changes in the context could break it. In that case you could consider two new messages per turn instead: one saying “here’s information relevant to the user’s next question”, and one for the user’s voice. That would take trials to judge the quality of the ongoing chat responses, plus the extra work of deleting the retrieval message back out of the chat quickly so those messages don’t pile up, increasing costs and reducing attention. It’s essentially a function call with no call.
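A sketch of that alternative, assuming the `conversation.item.create` / `conversation.item.delete` client-event shapes from the Realtime API docs (the item id, role choice, and retrieval text are illustrative):

```python
import json

def build_retrieval_item(item_id: str, rag_text: str) -> str:
    """Create a system-style message item carrying retrieval results for one turn.

    Assumption: message items accept a "system" role with "input_text" content,
    per the Realtime API conversation.item.create event.
    """
    return json.dumps({
        "type": "conversation.item.create",
        "item": {
            "id": item_id,
            "type": "message",
            "role": "system",
            "content": [{
                "type": "input_text",
                "text": "Information relevant to the user's next question:\n" + rag_text,
            }],
        },
    })

def build_retrieval_delete(item_id: str) -> str:
    """Delete the retrieval item once the turn completes, so items don't pile up."""
    return json.dumps({
        "type": "conversation.item.delete",
        "item_id": item_id,
    })

create_event = build_retrieval_item("rag_turn_001", "Shipping takes 3-5 days.")
delete_event = build_retrieval_delete("rag_turn_001")
```

Sending the delete event right after the assistant’s response keeps only the user turns in the ongoing history, which is the cleanup this approach requires.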

Having thought more about this, an easy approach could be to populate the “memory” (the session instructions) with the knowledge base up front, so the model has access to it and can draw on it as the conversation goes on.
