I have been able to solve the issue of “context-aware” embeddings this way: Using gpt-4 API to Semantically Chunk Documents - #172 by SomebodySysop
- Each embedding chunk has a metadata property that uniquely identifies it and its position in the chunked document.
- When a chunk is retrieved via cosine similarity search, I use its identifier to programmatically locate the adjacent chunks.
- I send the original key chunk, together with its adjacent chunks, the question, and the chat history to the model to generate a response.
So far, this is working very well. My chunks can be as small as one sentence or as large as multiple paragraphs, and I can adjust how many adjacent chunks are returned depending on the type of documents processed. This is the adjacent chunk “radius” as defined by @curt.kennedy.
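For anyone curious, here is a minimal sketch of the idea in Python. The `doc_id#index` identifier scheme, the in-memory `store`, and the function names are hypothetical stand-ins I made up for illustration; in my actual setup the identifier lives in each chunk's metadata in the vector store (see the linked code below).

```python
from typing import Dict, List

def chunk_id(doc_id: str, index: int) -> str:
    """Encode the document and the chunk's ordinal position into one identifier."""
    return f"{doc_id}#{index:05d}"

def adjacent_window(store: Dict[str, str], doc_id: str, index: int, radius: int = 2) -> List[str]:
    """Return the key chunk plus up to `radius` chunks on each side, in document order."""
    window: List[str] = []
    for i in range(index - radius, index + radius + 1):
        cid = chunk_id(doc_id, i)
        if cid in store:  # positions before the start or past the end of the doc don't exist
            window.append(store[cid])
    return window

# Toy store standing in for the chunks persisted in the vector database.
store = {chunk_id("doc-42", i): f"chunk {i} text" for i in range(10)}

# Suppose cosine similarity search returned the chunk at position 7 of "doc-42".
# A radius of 2 expands it to chunks 5..9, which are concatenated into the context
# sent to the model along with the question and chat history.
context = "\n\n".join(adjacent_window(store, "doc-42", 7, radius=2))
print(context)
```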
I actually shared the code I used to do this here: Retrieving “Adjacent” Chunks for Better Context - #10 by SomebodySysop - Support - Weaviate Community Forum