RAG is not really a solution


In my thinking, I’ve already done this “pre-chunk” using Semantic Chunking: https://youtu.be/w_veb816Asg?si=yr4TLKFi_sGex4Pm

Now, I want to semantically chunk these semantically chunked pre-chunks (I hope that sentence makes sense!) into semantically complete pieces of text that are no larger than x tokens.

At this point, we are talking about sections, subsections, chapters, or sub-chapters.

In the case of The Bible, for example, we would be talking about breaking down the chapters by verse.

In the case of The Talmud, we are talking about breaking down the tractates by “dafs”.

In a legal contract or municipal code, we would be talking about breaking down sections or sub-sections semantically.
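For documents that already carry explicit markers (verse numbers, section numbers), a structural pass could come first. Here is a minimal sketch, not a finished implementation: `split_units` and `pack_units` are hypothetical helper names, the verse pattern is just an example, and counting whitespace-separated words is only a stand-in for a real tokenizer.

```python
import re
from typing import List


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    # Swap in the tokenizer of the model you actually use.
    return len(text.split())


def split_units(text: str, boundary: str) -> List[str]:
    # Cut the text at the start of every boundary match, so each marker
    # (e.g. a verse number) stays attached to the unit it introduces.
    starts = [m.start() for m in re.finditer(boundary, text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any text that precedes the first marker
    ends = starts[1:] + [len(text)]
    units = [text[s:e].strip() for s, e in zip(starts, ends)]
    return [u for u in units if u]


def pack_units(units: List[str], max_tokens: int) -> List[str]:
    # Greedily merge consecutive units while the chunk stays within budget.
    # A single unit larger than max_tokens is kept whole as its own chunk.
    chunks: List[str] = []
    current: List[str] = []
    used = 0
    for unit in units:
        n = count_tokens(unit)
        if current and used + n > max_tokens:
            chunks.append(" ".join(current))
            current, used = [], 0
        current.append(unit)
        used += n
    if current:
        chunks.append(" ".join(current))
    return chunks


# Example boundary: chapter:verse markers such as "1:1".
VERSE = r"\d+:\d+"
sample = (
    "1:1 In the beginning God created the heaven and the earth. "
    "1:2 And the earth was without form. "
    "1:3 And God said, Let there be light."
)
chunks = pack_units(split_units(sample, VERSE), max_tokens=16)
```

Chunks never cut through a verse or section, because the budget is only checked at unit boundaries. Anything without clean markers would still need the model-based pass.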

My thinking is that we could use the model to do this – we just need the right prompt which tells it what to do.
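A rough sketch of how that might be wired up, with the caveat that the prompt wording is only a first guess and not a tested prompt. `complete` is a hypothetical callable standing in for whichever model API you use (prompt string in, reply string out), and `DELIMITER` is an arbitrary marker I made up for parsing the reply.

```python
from typing import Callable, List

# Hypothetical marker the model is asked to place between chunks.
DELIMITER = "<<<CHUNK>>>"


def build_prompt(text: str, max_tokens: int) -> str:
    # Assumption: an instruction along these lines; expect to iterate on it.
    return (
        "Split the text below into semantically complete pieces, "
        f"each no larger than {max_tokens} tokens. "
        "Break only at natural boundaries such as sections, subsections, "
        "or verses. Copy the text exactly, without rewording or omitting "
        f"anything, and separate the pieces with the line {DELIMITER}.\n\n"
        f"TEXT:\n{text}"
    )


def chunk_with_model(
    text: str, max_tokens: int, complete: Callable[[str], str]
) -> List[str]:
    # Send the prompt to the model and split its reply on the delimiter.
    reply = complete(build_prompt(text, max_tokens))
    pieces = [p.strip() for p in reply.split(DELIMITER) if p.strip()]
    # Guard against paraphrasing: every piece must appear verbatim in the source.
    for piece in pieces:
        if piece not in text:
            raise ValueError("model output does not match the source text")
    return pieces
```

The verbatim check is there because a model asked to split text may quietly rewrite it. It does not catch dropped passages or oversized pieces, so a token count on each returned piece would be a sensible extra check.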

Yes, keep us updated. I’m sure you’ll come up with that prompt!