HyDE with hybrid search approaches

I wasn’t thinking of correlating previous results with the HyDE answer.

I was thinking you rank the original text independently using HyDE, semantic search, and keyword search. That gives you (3) streams of rankings, which you then fuse into one ranking using RRF (Reciprocal Rank Fusion).

The only nuance is that HyDE can be considered a new query of sorts (it generates a hypothetical answer from the original query, and that synthetic text is what you search with). So with this you could do a 2-leg RRF: one leg with semantic search on the HyDE generated query, one with keywords on the HyDE generated query.

So putting all this together, you have (4) streams (max) to fuse in RRF.

  1. Semantic on original query
  2. Keywords on original query
  3. Semantic on HyDE generated query
  4. Keywords on HyDE generated query
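
To make the fusion step concrete, here is a minimal sketch of RRF over the four streams. The document IDs and the four example rankings are made up for illustration, and `k=60` is just the commonly used default constant, not something tuned for your data.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit in each stream contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical output of the four legs, best match first.
semantic_original = ["d1", "d2", "d3"]
keyword_original = ["d2", "d4", "d1"]
semantic_hyde = ["d2", "d1", "d5"]
keyword_hyde = ["d4", "d2", "d6"]

fused = rrf_fuse([semantic_original, keyword_original, semantic_hyde, keyword_hyde])
print(fused)  # ['d2', 'd1', 'd4', 'd3', 'd5', 'd6']
```

Note that RRF only looks at rank positions, so the raw scores from the semantic and keyword legs never need to be put on a common scale.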

All of these can run in parallel, with no dependencies between them. So do this, and when the last one finishes, fuse them all into a single ranking using RRF.
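
A rough sketch of running the legs side by side, assuming the HyDE text has already been generated. `semantic_search` and `keyword_search` are hypothetical placeholders for whatever vector store and keyword index you use; only the thread-pool pattern is the point here.

```python
from concurrent.futures import ThreadPoolExecutor

def semantic_search(query):
    # Hypothetical stand-in for an embedding / vector-store lookup.
    return [f"sem:{query}:{i}" for i in range(3)]

def keyword_search(query):
    # Hypothetical stand-in for a BM25 / keyword index lookup.
    return [f"kw:{query}:{i}" for i in range(3)]

def run_legs(original_query, hyde_query):
    """Run all four retrieval legs concurrently and return their rankings."""
    legs = {
        "semantic_original": (semantic_search, original_query),
        "keyword_original": (keyword_search, original_query),
        "semantic_hyde": (semantic_search, hyde_query),
        "keyword_hyde": (keyword_search, hyde_query),
    }
    with ThreadPoolExecutor(max_workers=len(legs)) as pool:
        futures = {name: pool.submit(fn, q) for name, (fn, q) in legs.items()}
        # result() blocks per leg, so this returns once the slowest leg is done.
        return {name: fut.result() for name, fut in futures.items()}

rankings = run_legs("original question", "hyde generated text")
```

The four lists in `rankings` are then ready to hand to RRF. With this layout the retrieval time is roughly that of the slowest leg rather than the sum of all four.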

This is different from what you are saying above, because I am not correlating results from keyword or semantic search with anything from HyDE. I am treating each leg as an independent processing stream, which is good for lowering the latency of the overall search.
