Standard RAG + Agent Solution

Hey there everyone!

I’ve been lurking on the forums for a bit, but decided to post a question and get feedback from the community.

I’m working on a chatbot for my company that uses GPT models (GPT-3.5, GPT-4) to perform RAG on our proprietary documents. I’ve explored the ReAct pattern combined with “memory” for conversation tracking, and the use of tools (function calling). Is there a more optimized method, or any recommendations this community can offer based on current best practices?

Additionally, I’m curious about Bing’s RAG implementation. I’ve observed that Bing’s chat can distinguish between follow-up queries and questions that require a broader internet search, and that the engine rewrites the original question before searching further. I assume their engine is a combination of prompt engineering and multiple LLM calls. Has anyone come across a systematic approach to replicating an efficient retrieval engine like this?

If there are articles or blog posts that cover what I am describing, I would greatly appreciate any pointers.


Bing, like ChatGPT with browsing, uses OpenAI models that request information through function calls. The model decides when to search; retrieved text is not automatically added to every prompt.

If you want search/follow/browse behavior like those applications, you can simply offer the AI your own search function, which can use whatever retrieval techniques suit your specific domain.
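As a rough illustration of that idea, here is a minimal sketch of exposing your own search as a callable tool. Everything in it is an assumption for the example: the `search_docs` name, the toy `DOCS` corpus, and the keyword-overlap scoring are hypothetical stand-ins for whatever retrieval you actually use. The tool description follows the JSON-schema style used for function calling, and no API call is made here; `run_tool_call` only shows the step where your code executes what the model asked for.

```python
import json

# Hypothetical in-memory corpus standing in for proprietary documents.
DOCS = {
    "vacation-policy": "Employees accrue 1.5 vacation days per month.",
    "expense-policy": "Expenses above 500 USD need manager approval.",
}


def search_docs(query: str, top_k: int = 2) -> list[dict]:
    """Toy keyword-overlap search; replace with your own domain-specific retrieval."""
    terms = set(query.lower().split())
    scored = []
    for doc_id, text in DOCS.items():
        words = set(text.lower().replace(".", "").split()) | set(doc_id.split("-"))
        score = len(terms & words)
        if score:  # keep only documents sharing at least one term with the query
            scored.append((score, doc_id, text))
    # Highest score first; ties broken by document id for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [{"id": doc_id, "text": text} for _, doc_id, text in scored[:top_k]]


# Tool description the model sees. The wording of the descriptions is what
# nudges the model to skip the search on simple follow-ups and to rewrite
# the question into a standalone query when it does search.
SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "search_docs",
        "description": (
            "Search internal company documents. Call only when the answer "
            "is not already available in the conversation."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": (
                        "Standalone search query, rewritten to include any "
                        "context needed from earlier turns."
                    ),
                },
            },
            "required": ["query"],
        },
    },
}

# Maps tool names the model may request to local Python callables.
TOOL_REGISTRY = {"search_docs": search_docs}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Run a tool call requested by the model; return a JSON string as the tool result."""
    if name not in TOOL_REGISTRY:
        return json.dumps({"error": f"unknown tool: {name}"})
    args = json.loads(arguments_json)
    return json.dumps(TOOL_REGISTRY[name](**args))
```

In a real loop you would pass `SEARCH_TOOL` in the request's tool list, and whenever the model responds with a tool call, feed its name and argument string to `run_tool_call` and send the result back as the tool message before asking for the final answer. If the model answers directly, no search happens, which is how the follow-up versus new-search distinction falls out of this design.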

You might want to check out this thread…

Welcome to the forum.