Yes, I’m aware of how semantic embeddings work, and of HyDE. My main point is that transforming a user’s query (especially as a starting point, without even trying the raw query against the database) seems like a bad idea. It adds a potentially unnecessary layer of complexity that makes it harder to understand what’s going on behind the scenes, and it can also “dilute” the meaning behind the user’s query.
It also relies on the LLM to answer the question in such a way that the answer aligns with the documents better than the question itself does. So, beyond tweaking and reshuffling the prompt, there really isn’t much room for improvement. If you find that HyDE is failing, what’s next?
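To make that concrete, here’s a toy sketch of the HyDE shape with the LLM stubbed out and a bag-of-words stand-in for a real embedding model (both are placeholders I made up, not anyone’s actual implementation). The point is that the hypothetical answer is the only thing retrieval sees, so if its generation drifts, retrieval drifts with it, and the prompt is your only knob:

```python
# Toy HyDE shape: embed a generated "hypothetical answer" instead of the
# query. embed() is a bag-of-words stand-in for a real embedding model.
from collections import Counter
import math

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def fake_llm_answer(question: str) -> str:
    # Stand-in for the LLM call. In real HyDE this text is generated,
    # and retrieval quality hinges entirely on how it comes out.
    return "the capital of france is paris a large european city"

docs = ["paris is the capital and largest city of france",
        "berlin is the capital of germany"]

query = "what is the capital of france"
hypothetical = fake_llm_answer(query)

# Rank documents by similarity to the hypothetical answer, not the query.
scores = [cosine(embed(hypothetical), embed(d)) for d in docs]
```

Everything between the user’s question and the retrieved documents is hidden inside that one generation step, which is the debuggability problem.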
It definitely has its uses, don’t get me wrong, which is why I said “don’t go with it initially”.
For example, I have a very nuanced database that needs to return precise information. I rely heavily on keywords because of this (but also need semantic embeddings to understand the question). The database contains product names with misspellings baked in, plus dimensions, and either can drastically alter the results. This is a clear-cut case where using an LLM to alter the user’s query would be a terrible decision.
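A minimal sketch of that failure mode, with a made-up product name and a trivial token-overlap scorer standing in for real keyword search: when the LLM helpfully “corrects” the spelling and splits the dimension, the exact-match keywords stop hitting the document.

```python
# Toy illustration (made-up catalog entry) of an LLM rewrite breaking
# exact keyword matching against an intentionally misspelled product name.

def keyword_score(query: str, document: str) -> int:
    """Count query tokens that appear verbatim in the document."""
    doc_tokens = set(document.lower().split())
    return sum(tok in doc_tokens for tok in query.lower().split())

# The catalog stores the misspelled name exactly as it appears upstream.
doc = "Widgit-Pro 25mm mounting bracket anodized"

raw_query = "widgit-pro 25mm bracket"    # user copies the real name
rewritten = "widget pro 25 mm bracket"   # LLM "fixes" spelling and units

print(keyword_score(raw_query, doc))   # all three keywords hit
print(keyword_score(rewritten, doc))   # only "bracket" survives
```

Real keyword search (BM25 and friends) is fancier than this, but the core issue is the same: the rewrite destroys the exact tokens the index depends on.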
If it sounds like I’m repeating myself, I know. You didn’t address anything I said and instead went off on a tangent.