I’ve spent the last year figuring out what I wanted to build in the open-source LLM space. During that time, I’ve focused primarily on RAG, as I believe it is the biggest bottleneck in creating useful AI applications.
For the last 4-5 months, I’ve been developing R2R, an open-source framework for deploying user-facing RAG applications. Think of R2R as an open-source Firebase for RAG, providing everything you need to launch and scale a high-quality user-facing application.
We’ve been working on v2 over the past two months, and it’s finally about ready to go. Here are some of the key features:
Configurable LLM provider support and streamlined builder/factory patterns for application customization
Server-client architecture with REST API endpoints for ingestion, search, and RAG (see the sketch after this list)
Additional endpoints for document management (upsert, update, delete, get chunks, global doc info) and user management (delete user, get user docs, get aggregate user stats)
Native hybrid search (semantic + keyword), reranking support, and multimodal file ingestion with support for .txt, .pdf, .html, .json, .ppt, .docx, .xlsx, .md, .mp3, .png, .svg, .jpg, and .mp4
End-to-end observability/logging of ingestion and RAG requests with methods for computing aggregate analytics - choose from SQLite, Postgres, and Redis
Open-source dashboard that connects with deployed applications to showcase all these features
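To make the server-client piece concrete, here is a rough sketch of what calling a deployed instance might look like over plain HTTP. The endpoint paths and payload shapes below are illustrative assumptions, not the exact R2R API; check the docs for the real routes.

```python
import requests

BASE_URL = "http://localhost:8000"  # assumed local deployment address

# Ingest a document (hypothetical endpoint path and payload shape)
with open("report.pdf", "rb") as f:
    requests.post(f"{BASE_URL}/ingest", files={"file": f})

# Run a RAG query against the ingested corpus (again, an illustrative route)
response = requests.post(
    f"{BASE_URL}/rag",
    json={"query": "What were the key findings in the report?"},
)
print(response.json())
```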
Our multimodal features are powered by Whisper + GPT-4o, and OpenAI is the default provider for the framework.
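For anyone curious what the audio path looks like in practice, transcription with Whisper via the OpenAI SDK is roughly this (a minimal sketch, not R2R’s internal code):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Transcribe an audio file to text, which can then be chunked and
# embedded like any other document.
with open("meeting.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )
print(transcript.text)
```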
If you want a quick-start guide, I recommend the R2R demo in our docs.
We are a YC-backed company from the W24 batch and aren’t planning on pivoting any time soon, so you can count on continued support and improvement of the framework for the foreseeable future.
Sounds like a great project. I’m more of a prompt engineer than a coder, but you’re right: RAG is a major bottleneck and therefore a huge opportunity for someone to solve. Not only that, but depending on execution, RAG structuring is how one tool will outperform another when both rely on the same knowledge documents. I look forward to learning more. Thanks for this and for posting.
Smart to build on top of OpenAI. The future winners all do this.
I think in the near future “prompt engineer” will give way to “RAG engineer”, and hopefully tools like this will enable that transition, so I’m looking forward to your feedback once you’ve had a chance to dive in more.
Both of these terms are legally dubious depending on where you are in the world. The term “engineer” generally means someone with an engineering degree from a university, but in the US and UK it can pretty much be used by anybody to mean anything.
The term “engineer” is a protected title in Canada and in most of the EU; claiming to be an engineer without being licensed is against the law, and there are currently no accredited universities offering programs for becoming a prompt engineer.
Accuracy-wise, this is a function of which modules you enable. For best results I would run with Hybrid Search and HyDE, both of which are supported and outlined in the cookbooks here: Hybrid Search.
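For anyone unfamiliar with HyDE (Hypothetical Document Embeddings): instead of embedding the raw query, you first have an LLM write a hypothetical answer and embed that, which tends to land closer to relevant passages in embedding space. A minimal sketch of the technique, where the `vector_search` helper is a stand-in for whatever vector store you use (not an R2R function):

```python
from openai import OpenAI

client = OpenAI()

def hyde_search(query: str, vector_search, top_k: int = 5):
    # 1. Ask the LLM to write a hypothetical passage answering the query.
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"Write a short passage that answers: {query}",
        }],
    )
    hypothetical_doc = completion.choices[0].message.content

    # 2. Embed the hypothetical passage instead of the raw query.
    embedding = client.embeddings.create(
        model="text-embedding-3-small",
        input=hypothetical_doc,
    ).data[0].embedding

    # 3. Search the vector store with that embedding.
    return vector_search(embedding, top_k=top_k)
```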
This looks absolutely amazing! Thanks for sharing the project! As someone just learning Python, this will be a good project for me to study, especially since I was planning to use Postgres as my vector DB some day, and that’s what you have!
EDIT: I just downloaded it and searched for langchain, and didn’t really find it used. You did this all in Python but without LangChain? I always use LangChain in Python apps because it lets me switch between different local LLMs or cloud providers without rewriting any code. I’ll be interested to see how/if you support switching providers this easily.
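For reference, the provider-swapping pattern described above usually boils down to a small interface plus a factory, something like the following (hypothetical class names, not R2R’s actual internals):

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Common interface so application code never touches a specific SDK."""

    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class OpenAIProvider(LLMProvider):
    def complete(self, prompt: str) -> str:
        from openai import OpenAI
        client = OpenAI()
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

class LocalProvider(LLMProvider):
    def complete(self, prompt: str) -> str:
        # Stand-in for a local model call (e.g., via an HTTP endpoint).
        raise NotImplementedError

def get_provider(name: str) -> LLMProvider:
    # Factory: swap providers via config without rewriting application code.
    providers = {"openai": OpenAIProvider, "local": LocalProvider}
    return providers[name]()
```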
I’ve actually been having the same thoughts lately. I developed an open-source model called Helia, though some parts of it are borrowed from Llama 3, like the tokenizer and generation code. No idea how to run it though, so it’s still under construction.
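If the checkpoint follows the standard Hugging Face layout, loading and generating is roughly this (a sketch only; it assumes the directory is transformers-compatible, which a custom architecture may not be):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumes the checkpoint directory follows the standard HF layout
# (config.json, tokenizer files, and weight shards).
tokenizer = AutoTokenizer.from_pretrained("./helia-checkpoint")
model = AutoModelForCausalLM.from_pretrained("./helia-checkpoint")

inputs = tokenizer("Hello, world", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```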