We are building an assistant that serves as an expert guide for multiple, distinct tabletop games/rulebooks (The Bestiary, The League, etc.). All documents reside in a single vector store, but they are differentiated by metadata fields (e.g., game: “The Bestiary” or doc_type: “Rules”).
We want to achieve deterministic, secure retrieval by defining multiple instances of the built-in file_search tool, each with a specific, non-negotiable filters parameter. The reason is that, while the logs clearly show the model formulating correct queries (e.g., the correct file name or attribute), the file retrieval logic returns the wrong passages. The model is surprised by this and gets trapped in a long reasoning chain that looks rather random.
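To make the setup concrete, here is a minimal sketch of what we mean by several filtered file_search instances. It only builds the tool entries as plain dictionaries and does not call the API. The field shapes (vector_store_ids, and filters with eq / and comparisons) reflect our understanding of the file_search tool and should be checked against the current documentation; VECTOR_STORE_ID and the helper filtered_file_search are hypothetical names used for illustration.

```python
from typing import Any, Dict, Optional

# Hypothetical placeholder, not a real vector store id.
VECTOR_STORE_ID = "vs_PLACEHOLDER"


def filtered_file_search(game: str, doc_type: Optional[str] = None) -> Dict[str, Any]:
    """Build one file_search tool entry pinned to a single game.

    Assumption: metadata filters take the form
    {"type": "eq", "key": ..., "value": ...}, optionally combined
    with {"type": "and", "filters": [...]}.
    """
    game_filter = {"type": "eq", "key": "game", "value": game}
    if doc_type is None:
        filters: Dict[str, Any] = game_filter
    else:
        doc_filter = {"type": "eq", "key": "doc_type", "value": doc_type}
        filters = {"type": "and", "filters": [game_filter, doc_filter]}
    return {
        "type": "file_search",
        "vector_store_ids": [VECTOR_STORE_ID],
        "filters": filters,
    }


# One instance per game; all of them point at the same vector store.
tools = [
    filtered_file_search("The Bestiary", doc_type="Rules"),
    filtered_file_search("The League"),
]
```

The entries differ only in their filters value, so from the model's point of view there is nothing descriptive to distinguish one instance from another.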
The Problem (Limitation):
The current implementation of the file_search tool does not allow the definition of a name or description attribute for each instance. This prevents the LLM from reliably and contextually selecting the correct filtered tool.
• Result: The LLM’s selection becomes ambiguous (when no filters are passed it gets trapped in long reasoning chains) and prone to positional bias (e.g., consistently choosing the last file_search tool in the list).
• Workaround: we are forced to accept being billed for a vast number of input tokens, because each reasoning step (rightfully) complains about the incompleteness of the retrieval.
The Cost of Alternatives:
Since the current method is unreliable, the most robust alternative is to implement a custom RAG router/agent approach (using a single function tool call with the appropriate filter for each agent) that first generates the filter and then executes the search. This method has significant disadvantages:
1. Increased Cost: It requires two full model turns (Router/Classifier LLM turn + Search Execution LLM turn), effectively doubling the input token consumption.
2. Wasted Reasoning Budget: The model’s valuable reasoning budget is often consumed by the initial classification/routing step rather than being fully focused on synthesizing the final answer.
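For reference, the router alternative described above could look roughly like the sketch below. It defines a single function tool whose game argument is restricted to known games, then turns the model's arguments into a metadata filter on our side. The tool name search_rulebook, the GAMES list, and the helper build_filter are hypothetical, and the filter shape is the same assumption as for file_search filters. The actual search call is left out.

```python
import json
from typing import Any, Dict

# Hypothetical list of games stored in the shared vector store.
GAMES = ["The Bestiary", "The League"]

# A single function tool: unlike file_search, it carries a name and a
# description, and the enum constrains which game the model may pick.
SEARCH_RULEBOOK_TOOL: Dict[str, Any] = {
    "type": "function",
    "name": "search_rulebook",
    "description": "Search the rulebook of exactly one game.",
    "parameters": {
        "type": "object",
        "properties": {
            "game": {"type": "string", "enum": GAMES},
            "query": {"type": "string"},
        },
        "required": ["game", "query"],
        "additionalProperties": False,
    },
}


def build_filter(arguments_json: str) -> Dict[str, Any]:
    """Turn the model's function-call arguments into a metadata filter.

    The game is validated again here so that the filter never depends
    solely on the model respecting the enum.
    """
    args = json.loads(arguments_json)
    game = args["game"]
    if game not in GAMES:
        raise ValueError(f"Unknown game: {game!r}")
    return {"type": "eq", "key": "game", "value": game}


# Example: arguments as they might arrive from a function call.
example_filter = build_filter('{"game": "The League", "query": "draft order"}')
```

This keeps the filter deterministic once the game is chosen, but the choice itself still costs the extra routing turn listed above.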
Am I missing something?