Using gpt-4 API to Semantically Chunk Documents

Thank you for the kind words. I created this video a year ago, https://youtu.be/w_veb816Asg, which I believe makes me one of the first people to coin the phrase “Semantic Chunking”.

In RAG, the quality of your model responses is 100% dependent upon the quality of your vector store retrievals. So it’s simple: the better your embeddings, the better your RAG application will perform.

While I began organizing my document chunks to be embedded in a more hierarchical manner, I still used the “sliding window” approach when it came to the actual embedding of the text chunks. As a result of a discussion back in early April (RAG is not really a solution - #43 by SomebodySysop), I decided to start this thread and explore how to totally automate a Semantic Embedding process – using only code and the actual models, and without having to rely on LangChain.
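To illustrate the difference between the two approaches: a minimal sketch, with function names of my own invention (the actual pipeline discussed in this thread is not public). The sliding window splits at fixed character offsets regardless of meaning, while the semantic version splits at breakpoints proposed by a model, here abstracted as a `propose_breakpoints` callable that could be backed by a gpt-4 call returning topic-shift offsets.

```python
def sliding_window_chunks(text, window_size=200, overlap=50):
    """Fixed-size, overlapping chunks: the 'sliding window' approach.
    Boundaries are arbitrary character offsets, not semantic units."""
    step = window_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + window_size])
        if start + window_size >= len(text):
            break
    return chunks


def semantic_chunks(text, propose_breakpoints):
    """Split at boundaries proposed by a model (e.g. a gpt-4 prompt
    asked to return the character offsets of topic shifts)."""
    points = sorted(set(propose_breakpoints(text)))
    bounds = [0] + [p for p in points if 0 < p < len(text)] + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:])]
```

In practice `propose_breakpoints` would wrap an API call; injecting it as a parameter keeps the chunking logic testable without the model.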

Glad I did, because with the help of other participants, including @sergeliatko and @jr.2509 , I have come up with a solution that fits into my embedding pipeline beautifully and – so far – appears to do what I’ve been wanting to do for over a year now.

I would love to make this code available in a public distribution, but the amount of time and effort it would take me to pull it out of my existing infrastructure would be prohibitive. In thinking about this, I realized what would be far easier would be to make the API itself publicly available. Yes, it would be for a fee, but I would basically only charge for the tokens used, with a reasonable markup.

So, to be clear, the idea of a Semantic Chunking API is just that: an idea. I’ve still got plenty of work to do to test this thing out on a variety of documents to discover the glitches.

Again, many thanks to everyone who has helped on this project.
