Is it possible to get "context aware" embeddings?

curt.kennedy · June 24, 2024, 5:40pm

You can embed at several granularities: word, sentence, paragraph, page, and so on. Search across all of them, then fuse the per-granularity rankings into one overall result, for example with reciprocal rank fusion (RRF) or relative score fusion (RSF). You can also weight each granularity differently, say paragraphs over sentences over words, or whatever you decide.
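To make the fusion step concrete, here is a minimal sketch of weighted RRF in Python. It assumes you already have one ranked list of hits per granularity, and that every hit has been mapped to a common ID (here, the index of the paragraph that contains it) so that hits from different granularities can be combined. The function name, the weights, and the constant `k=60` are illustrative choices, not something fixed by the method.

```python
from collections import defaultdict

def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists with weighted reciprocal rank fusion.

    rankings: {granularity: [common_id, ...]} with the best hit first.
    weights:  {granularity: float}; missing granularities default to 1.0.
    k:        smoothing constant (60 is a common default, assumed here).
    """
    scores = defaultdict(float)
    for granularity, ranked_ids in rankings.items():
        w = weights.get(granularity, 1.0)
        for rank, common_id in enumerate(ranked_ids, start=1):
            # Each list contributes w / (k + rank) to the item's fused score.
            scores[common_id] += w / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical hits, already mapped to paragraph indices.
rankings = {
    "paragraph": [3, 7, 1],
    "sentence": [7, 3, 9],
    "word": [9, 7],
}
weights = {"paragraph": 3.0, "sentence": 2.0, "word": 1.0}

fused = weighted_rrf(rankings, weights)
print([pid for pid, _ in fused])  # [7, 3, 9, 1]
```

Paragraph 7 wins because it shows up in all three lists, even though it is not the top paragraph-level hit. RSF would work the same way structurally, but would sum normalized similarity scores instead of reciprocal ranks.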

If each chunk carries an index recording its position within the corpus, you can also grab the adjacent chunks to provide more surrounding context.
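A rough sketch of the neighbor lookup, assuming chunks at a given granularity are numbered 0..N-1 in corpus order (the function name and the `radius` parameter are made up for illustration):

```python
def expand_with_neighbors(hit_positions, num_chunks, radius=1):
    """Return the sorted positions of each hit plus its neighbors.

    hit_positions: positions of the retrieved chunks in corpus order.
    num_chunks:    total number of chunks at this granularity.
    radius:        how many chunks to add on each side of a hit.
    """
    keep = set()
    for pos in hit_positions:
        lo = max(0, pos - radius)               # clamp at the start of the corpus
        hi = min(num_chunks - 1, pos + radius)  # clamp at the end of the corpus
        keep.update(range(lo, hi + 1))
    return sorted(keep)

print(expand_with_neighbors([0, 5], num_chunks=7, radius=1))  # [0, 1, 4, 5, 6]
```

Using a set means overlapping neighborhoods only get included once.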

So: search at every granularity, fuse the results, take the highest-scoring fused hits, expand each one by some radius of neighboring chunks, and those are your chunks for RAG.
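For the last two steps (take the top fused hits, expand by a radius), one way to assemble the final passages is sketched below. It assumes the fused hits are chunk positions sorted best-first, and that `chunks` is the corpus split into an ordered list of strings; merging overlapping windows into a single passage is my own addition so the prompt doesn't repeat text. All names here are hypothetical.

```python
def build_rag_passages(fused_positions, chunks, top_n=3, radius=1):
    """Turn the top fused hits into merged, contiguous text passages.

    fused_positions: chunk positions sorted best-first by fused score.
    chunks:          the corpus as an ordered list of chunk strings.
    """
    # One (start, end) window per top hit, clamped to the corpus bounds.
    windows = sorted(
        (max(0, p - radius), min(len(chunks) - 1, p + radius))
        for p in fused_positions[:top_n]
    )
    merged = []
    for lo, hi in windows:
        if merged and lo <= merged[-1][1] + 1:
            # Overlapping or touching windows become one passage.
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    return [" ".join(chunks[lo:hi + 1]) for lo, hi in merged]

chunks = [f"c{i}" for i in range(10)]
passages = build_rag_passages([7, 2, 3, 9], chunks, top_n=3, radius=1)
print(passages)  # ['c1 c2 c3 c4', 'c6 c7 c8']
```

The passages come back in corpus order rather than score order, which tends to read more naturally in a prompt; sort differently if you need the best hit first.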
