Anyone have tips on how to approach the “give me a list of X” type questions?
I've found that semantic search alone isn't enough for these because of its "retrieve top-k" nature. If the details are scattered across multiple pages in the document (or across many documents), typical RAG with semantic search falls short.
Example:
- List all the requirements for X listed in this document (fails especially when the details of each requirement are long and the requirements are spread across multiple pages). At best it will list 5 of the 20 requirements in the source, because it only uses the top-k chunks for the answer.
- Summarize this document (gives a convincing answer, but doesn't reliably capture all the key points before generating the summary)
Things I’ve tried:
- LLM chaining
- ex. “List all the requirements for X”
- step 1: loop through all chunks/pages → GPT adds a requirement to a running list any time it sees one it hasn't seen before
- step 2: do a top-k search for each requirement in list from step 1
- business logic (“give me a list of requirements for X” → filter chunks by metadata X → retrieval on the filtered chunks)
- for my use case, there is a limited set of “list x” type of questions. I could scan the entire document and calculate it ahead of time, since it’s slow/expensive at search time.
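A minimal sketch of step 1 of the chaining approach. The GPT call is stubbed with a toy regex extractor here so the loop is runnable; in practice `extract_fn` would prompt the model with each chunk and parse its answer:

```python
import re

def harvest_requirements(chunks, extract_fn):
    """Step 1 of the chain: scan every chunk and accumulate any
    requirement not already on the list (case-insensitive dedupe)."""
    seen, ordered = set(), []
    for chunk in chunks:
        for req in extract_fn(chunk):
            key = req.strip().lower()
            if key not in seen:
                seen.add(key)
                ordered.append(req.strip())
    return ordered

# Toy stand-in for the GPT extraction call.
def toy_extract(chunk):
    return re.findall(r"REQ: (.+)", chunk)

pages = [
    "intro text\nREQ: must support SSO",
    "REQ: must support SSO\nREQ: data is encrypted at rest",
    "appendix\nREQ: audit logs retained for 1 year",
]
print(harvest_requirements(pages, toy_extract))
# → ['must support SSO', 'data is encrypted at rest', 'audit logs retained for 1 year']
```

Step 2 then runs a normal top-k retrieval per harvested requirement, so each item gets its own focused context window.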
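The metadata-filter route can be sketched like this. The `topic` key and the term-overlap scorer are placeholders for whatever metadata your ingest pipeline assigns and for real embedding similarity:

```python
# Hypothetical chunk store: each chunk carries metadata assigned at ingest.
chunks = [
    {"text": "Payment API must retry failed charges.", "meta": {"topic": "payments"}},
    {"text": "Login page shows a CAPTCHA after 3 failures.", "meta": {"topic": "auth"}},
    {"text": "Refunds settle within 5 business days.", "meta": {"topic": "payments"}},
]

def filtered_retrieve(chunks, topic, query_terms, k=5):
    """Filter by metadata first, then rank only the survivors.
    The term-overlap score is a toy stand-in for vector similarity."""
    pool = [c for c in chunks if c["meta"].get("topic") == topic]
    scored = sorted(pool, key=lambda c: -sum(t in c["text"].lower() for t in query_terms))
    return [c["text"] for c in scored[:k]]

print(filtered_retrieve(chunks, "payments", ["refund", "charges"]))
# → ['Payment API must retry failed charges.', 'Refunds settle within 5 business days.']
```

Since the set of "list X" questions is limited, the same function can be run offline over the whole document and the results cached per question, avoiding the cost at search time.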
Things I’m considering (but require a lot of research):
- Create knowledge graph of the document (create entities, then create a graph of chunks linked together based on those entities and their relationships. Use GPT to create entities from the chunks in a data pre-processing step)
- Agents. Incorporate ReAct type prompting/chaining. I heard these are hard to debug and unreliable for the most part. Would love more data points here.
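For the knowledge-graph idea, the pre-processing output could be as simple as an adjacency map linking chunks that share an extracted entity. The entity-extraction step itself (the GPT pass) is assumed already done here, and the chunk ids/entity names are made up for illustration:

```python
from collections import defaultdict
from itertools import combinations

def build_chunk_graph(chunk_entities):
    """chunk_entities maps chunk id -> entities GPT extracted from it.
    Two chunks get an edge whenever they mention a common entity."""
    by_entity = defaultdict(set)
    for cid, ents in chunk_entities.items():
        for ent in ents:
            by_entity[ent].add(cid)
    edges = defaultdict(set)
    for cids in by_entity.values():
        for a, b in combinations(sorted(cids), 2):
            edges[a].add(b)
            edges[b].add(a)
    return dict(edges)

graph = build_chunk_graph({
    "p1": {"Requirement X", "Vendor A"},
    "p2": {"Requirement X"},
    "p3": {"Vendor A", "Budget"},
})
print(graph["p1"])  # linked to p2 via "Requirement X" and to p3 via "Vendor A"
```

At query time you'd expand the top-k hits along these edges to pull in the sibling chunks that plain similarity search misses, which is exactly the scattered-details failure mode above.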