That’s cool: that does chunking at the sentence level. Many were doing that a long time ago.
The thing is, I can’t find anyone talking about “floating-level” embeddings.
Here is how I do it:
After the “atomic idea” chunking and rebuilding the hierarchical structure of the document:
You build “depth-limited” outlines so you don’t drag every atomic idea from the deep sub-levels into a single embedded vector. Say the max depth you traverse is 2 levels.
Once you build those outlines at every level, root-to-leaves, and embed them, you add them to the vector DB alongside your regular “atomic idea” chunks.
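To make the build step concrete, here’s a minimal sketch. The `Node` tree, `outline()` function, and record shape are my illustrative assumptions (you already have the tree after the atomic-idea chunking step), not any specific library:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    text: str                              # this node's atomic idea / heading
    children: list["Node"] = field(default_factory=list)

def outline(node: Node, max_depth: int = 2, indent: int = 0) -> str:
    """Indented outline of the node's subtree, truncated at max_depth
    levels so deep sub-levels don't flood the embedded text."""
    lines = ["  " * indent + node.text]
    if max_depth > 0:
        for child in node.children:
            lines.append(outline(child, max_depth - 1, indent + 1))
    return "\n".join(lines)

def collect_records(root: Node, max_depth: int = 2) -> list[dict]:
    """One outline record per internal node (root-to-leaves), plus one
    record per atomic-idea leaf. Each record's text is what you would
    embed and insert into the vector DB."""
    records = []
    def walk(node: Node, level: int) -> None:
        if node.children:
            records.append({"level": level, "kind": "outline",
                            "text": outline(node, max_depth)})
            for child in node.children:
                walk(child, level + 1)
        else:
            records.append({"level": level, "kind": "atomic",
                            "text": node.text})
    walk(root, 0)
    return records
```

The point of the depth limit: the root-level outline still mentions its grandchildren, but anything deeper stays out of that vector and only shows up in lower-level outlines and atomic chunks.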
Now each query you send to your RAG operates simultaneously on several levels of abstraction (sentence, section, chapter, document, several documents, etc.), and the results are way cleaner and more complete.
If you add some minor logic at retrieval time (like pre-selecting samples based on usefulness, pulling a section’s full text when the query returns a section rather than an atomic idea, etc.), your results will be WAY better than what anyone expects.
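That retrieval logic can be sketched like this. The hit shape (`kind`, `score`, `text`, `section_id`), the score threshold, and the full-text lookup table are my assumptions for illustration, not a particular vector DB API:

```python
# Hypothetical store mapping a section id to its full text.
SECTION_FULL_TEXT = {
    "s1": "Full text of section 1 ...",
    "s2": "Full text of section 2 ...",
}

def postprocess(hits: list[dict], min_score: float = 0.3) -> list[str]:
    """Pre-select hits by usefulness (a simple score threshold here),
    then swap in the section's full text whenever an outline-level hit
    comes back instead of an atomic idea. Deduplicates the context."""
    out, seen = [], set()
    for hit in sorted(hits, key=lambda h: h["score"], reverse=True):
        if hit["score"] < min_score:
            continue  # pre-selection: drop low-usefulness samples
        if hit["kind"] == "outline":
            # A section matched, so hand the LLM the whole section,
            # not just its outline.
            text = SECTION_FULL_TEXT.get(hit["section_id"], hit["text"])
        else:
            text = hit["text"]
        if text not in seen:
            seen.add(text)
            out.append(text)
    return out
```

“Usefulness” here is just the similarity score for brevity; in practice you could plug in a reranker or any scoring you trust at that spot.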