file_search decides on its own how many search queries it emits. I’ve observed anywhere from 1 to 5 queries per call, which makes latency, cost, and retrieval accuracy all unpredictable: too few queries miss relevant chunks, while too many introduce noisy hits that crowd out the best matches. There’s no way to require at least N queries when recall matters, or to cap the count when precision is key.
Proposed change
Add two optional integers (assistant-level or per-run): a minimum and a maximum number of search queries per call.
Allow developers to write their own replacement tool language instead of the bad prompting that is currently used.
The internal tool language of file search does mention sending multiple queries. Yet that language is overgeneralized, is written as if a ChatGPT user had uploaded their attachments, and cannot be enhanced with knowledge of when file search (or any other tool) is actually useful to call in a developer application, or of what it will find.
The tool is also burdened by system message injections before every input that wrongly state that the user provided the files, which makes it pointless and leaves it without data guardrails.
Or: state in the documentation that file search is a minimum-effort, minimum-viable-product example, untuned and unsuitable, meant never to encroach on the quality of any document database search a developer could market themselves in the field of semantic retrieval, and priced per use to discourage product development with it. At a minimum, the documentation should say to use the API “search” call for vector stores alone, with your own callable developer function as the AI interface.
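For anyone taking the do-it-yourself route, here is a minimal sketch of what a developer-side function could look like when it enforces its own query bounds. Everything in it is an assumption for illustration: `bounded_search`, `min_queries`, `max_queries`, `top_k`, and the `(chunk_id, score, text)` hit shape are hypothetical names, not part of any API. `search_fn` stands in for whatever call you use to search a vector store with one query, wrapped so it returns hits in that shape.

```python
from typing import Callable, Iterable

# Hypothetical hit shape: (chunk_id, score, text). Adapt to your search call's real response.
Hit = tuple[str, float, str]


def bounded_search(
    queries: list[str],
    search_fn: Callable[[str], Iterable[Hit]],
    min_queries: int = 1,
    max_queries: int = 3,
    top_k: int = 5,
) -> list[Hit]:
    """Run between min_queries and max_queries searches and merge the hits."""
    if not 1 <= min_queries <= max_queries:
        raise ValueError("expected 1 <= min_queries <= max_queries")

    # Drop blank and duplicate queries while keeping the model's ordering.
    unique = list(dict.fromkeys(q.strip() for q in queries if q.strip()))

    # Too few queries: raise, so the caller can hand the message back to the
    # model as tool output and let it retry with more queries.
    if len(unique) < min_queries:
        raise ValueError(
            f"need at least {min_queries} distinct queries, got {len(unique)}"
        )

    # Cap the count, then keep the best-scoring copy of each chunk.
    best: dict[str, Hit] = {}
    for query in unique[:max_queries]:
        for hit in search_fn(query):
            chunk_id, score, _text = hit
            if chunk_id not in best or score > best[chunk_id][1]:
                best[chunk_id] = hit

    # Highest score first, truncated to top_k.
    return sorted(best.values(), key=lambda h: h[1], reverse=True)[:top_k]
```

You would expose this to the model as a function tool whose schema takes a list of query strings, with a description written for your own application and documents. The cap limits cost and noise, the floor forces broader recall, and deduplicating by chunk keeps repeated hits from crowding out distinct ones.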