The latest reference architecture I have floating in my head is basically to use a blend of embeddings and keywords, with extensive use of HyDE to steer the query.
Dense:
- Embeddings: take your pick; I would just use ada-002.
Sparse:
- Keywords use my MIX algorithm, which is similar to BM25 but has automatic stop-word detection. (MIX reference details) (MIX reference 10,000 ft level)
Deepen the search with what I call “HyDE projections” (HyDRA-HyDE ??? )
- Let’s say you have 5 different common views of a subject. Ask the LLM to generate answers from these 5 perspectives (HyDE), so that the question is re-framed from each of these additional perspectives. This re-framing is all you really need, I think, rather than a fine-tune, because the steering brings the query and the data into alignment. A lot of your papers mention fine-tuning as the answer, but I think re-framing from a fixed set of perspectives that you define can be just as powerful. If your subject domain is super rare and unknown to the model, then maybe in that case you need a fine-tune.
So in this scenario, you take the original query and generate the 5 other queries (5+1), which gives you 6 different queries, each pulled two ways:
- 6 embedding (dense) pulls
- 6 keyword (sparse) pulls
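Here is a rough sketch of that fan-out, just to make the bookkeeping concrete. Everything named here is hypothetical: `generate` stands in for whatever LLM call you use, `dense_search` and `sparse_search` stand in for your embedding and keyword retrievers, and the prompt wording is only a placeholder. Each retriever is assumed to return a ranked list of document ids, best first.

```python
from typing import Callable, Dict, List


def build_streams(
    query: str,
    perspectives: List[str],
    generate: Callable[[str], str],             # hypothetical LLM call: prompt -> text
    dense_search: Callable[[str], List[str]],   # hypothetical embedding retriever
    sparse_search: Callable[[str], List[str]],  # hypothetical keyword retriever
) -> Dict[str, List[str]]:
    """Return one ranked list of doc ids per (retriever, query text) pair."""
    # Start with the original query, then add one HyDE-style hypothetical
    # answer per perspective (placeholder prompt wording, an assumption).
    texts = {"original": query}
    for p in perspectives:
        prompt = f"Answer the following question from the perspective of {p}:\n{query}"
        texts[p] = generate(prompt)

    # Every text is pulled twice: once dense, once sparse.
    streams: Dict[str, List[str]] = {}
    for name, text in texts.items():
        streams[f"dense:{name}"] = dense_search(text)
        streams[f"sparse:{name}"] = sparse_search(text)
    return streams
```

With 5 perspectives this yields 6 texts and therefore 12 named streams, which keeps each stream addressable later if you want to weight it on its own.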
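So you have 12 streams to reconcile, and you just use RRF (Reciprocal Rank Fusion) to do this.
Each stream can be weighted differently through the K factor in the denominator of RRF: a stream adds 1/(K + rank) to a document's score, so increasing K for a given stream reduces that stream's influence on the final ranking.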
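A minimal sketch of RRF with a separate K per stream, assuming each stream is a ranked list of document ids with rank 1 as the best hit. The function name `rrf_fuse`, the stream names, and the default of 60 are my assumptions for illustration (60 is the value commonly used with RRF, not something fixed by this architecture).

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def rrf_fuse(
    streams: Dict[str, List[str]],
    k: Optional[Dict[str, float]] = None,
    default_k: float = 60.0,  # assumed default; tune per stream as needed
) -> List[Tuple[str, float]]:
    """Fuse ranked lists: score(doc) = sum over streams of 1 / (K_stream + rank)."""
    per_stream_k = k or {}
    scores: Dict[str, float] = defaultdict(float)
    for name, ranked in streams.items():
        k_s = per_stream_k.get(name, default_k)
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] += 1.0 / (k_s + rank)
    # Highest fused score first; ties are broken by doc id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Two streams that disagree; raising K on one lets the other win.
demo = {"dense:original": ["d1", "d2"], "sparse:original": ["d2", "d1"]}
fused = rrf_fuse(demo, k={"sparse:original": 1000.0})
```

In the demo, the sparse stream's large K shrinks its contribution, so the dense stream's top hit ends up first in `fused`.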