That’s cool: that does chunking at the sentence level. Many were doing that a long time ago.
The thing is, I can’t find anyone talking about “floating-level” embeddings.
Here is how I do it:
After the “atomic idea” chunking and rebuilding the hierarchical structure of the document:
You build “depth-limited” outlines so you don’t drag every atomic idea from the deep sub-levels into a single embedded vector. Say the max depth you traverse is 2 levels.
Once you build those outlines at every level, root-to-leaves, and embed them, you add them to the vector DB alongside your regular “atomic idea” chunks.
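To make the build step concrete, here’s a minimal sketch. The `Node` tree, `outline()` function, and record shape are my illustrative assumptions (you already have the tree after the atomic-idea chunking step), not any specific library:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    text: str                              # this node's atomic idea / heading
    children: list["Node"] = field(default_factory=list)

def outline(node: Node, max_depth: int = 2, indent: int = 0) -> str:
    """Indented outline of the node's subtree, truncated at max_depth
    levels so deep sub-levels don't flood the embedded text."""
    lines = ["  " * indent + node.text]
    if max_depth > 0:
        for child in node.children:
            lines.append(outline(child, max_depth - 1, indent + 1))
    return "\n".join(lines)

def collect_records(root: Node, max_depth: int = 2) -> list[dict]:
    """One outline record per internal node (root-to-leaves), plus one
    record per atomic-idea leaf. Each record's text is what you would
    embed and insert into the vector DB."""
    records = []
    def walk(node: Node, level: int) -> None:
        if node.children:
            records.append({"level": level, "kind": "outline",
                            "text": outline(node, max_depth)})
            for child in node.children:
                walk(child, level + 1)
        else:
            records.append({"level": level, "kind": "atomic",
                            "text": node.text})
    walk(root, 0)
    return records
```

The point of the depth limit: the root-level outline still mentions its grandchildren, but anything deeper stays out of that vector and only shows up in lower-level outlines and atomic chunks.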
Now each query you send to your RAG operates simultaneously on several levels of abstraction (sentence, section, chapter, document, several documents, etc.), and the results are way cleaner and more complete.
If you add some minor logic at retrieval time (like pre-selecting samples based on usefulness, pulling a section’s full text when the query returns a section rather than an atomic idea, etc.), your results will be WAY better than what anyone expects.
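That retrieval logic can be sketched like this. The hit shape (`kind`, `score`, `text`, `section_id`), the score threshold, and the full-text lookup table are my assumptions for illustration, not a particular vector DB API:

```python
# Hypothetical store mapping a section id to its full text.
SECTION_FULL_TEXT = {
    "s1": "Full text of section 1 ...",
    "s2": "Full text of section 2 ...",
}

def postprocess(hits: list[dict], min_score: float = 0.3) -> list[str]:
    """Pre-select hits by usefulness (a simple score threshold here),
    then swap in the section's full text whenever an outline-level hit
    comes back instead of an atomic idea. Deduplicates the context."""
    out, seen = [], set()
    for hit in sorted(hits, key=lambda h: h["score"], reverse=True):
        if hit["score"] < min_score:
            continue  # pre-selection: drop low-usefulness samples
        if hit["kind"] == "outline":
            # A section matched, so hand the LLM the whole section,
            # not just its outline.
            text = SECTION_FULL_TEXT.get(hit["section_id"], hit["text"])
        else:
            text = hit["text"]
        if text not in seen:
            seen.add(text)
            out.append(text)
    return out
```

“Usefulness” here is just the similarity score for brevity; in practice you could plug in a reranker or any scoring you trust at that spot.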