Best practice for using multiple documents with File Search in a single Responses API call?

TL;DR: I want the model (GPT-5), in one Responses API call, to search through 3 files, each for a distinct purpose. What’s the best way to implement this? One vector store with all three files? Three vector stores, one file each? Or something else?


Hi everyone,

I’m building a workflow where the model (GPT-5, high reasoning effort) needs to, in a single Responses API call:

  1. Check against one document to avoid duplication.
  2. Ground new content only in another document.
  3. Validate/QA against a third reference document.

I’m using the Responses API (not Azure), calling it from n8n. I’m not a software engineer (I mostly work with no/low-code tools), so I’d like to confirm the most reliable way to structure this with people who are better versed in this than I am.


What I asked ChatGPT

I asked ChatGPT how to achieve this, and it suggested:

  • Put all required documents into one vector store and attach that store to the file_search tool in the Responses API call.
  • In the prompt, clearly define the “role” of each file (e.g. one is for deduplication only, another for grounding only, another for QA only).
  • Require the model to self-report which file_id it used for each step, and forbid using the wrong file.
  • After receiving the response, programmatically validate the annotations and file_ids in the file_search_call.results to make sure each step used the correct document. If not, treat it as a violation.
  • Optionally: configure ranking options / score thresholds to control retrieval behavior.

This way, the whole multi-step reasoning happens in one call, but with post-validation of which document was actually used.
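
To make the question concrete, here’s my rough understanding of that suggestion as code, based on the OpenAI Python SDK docs for file search. I haven’t run this end-to-end: the filenames, prompt wording, and input task are placeholders, and the exact SDK namespace for vector stores (client.vector_stores vs. client.beta.vector_stores) may depend on your SDK version.

```python
from openai import OpenAI

client = OpenAI()

# One vector store holding all three documents (placeholder filenames).
store = client.vector_stores.create(name="workflow-docs")
file_streams = [open(p, "rb") for p in
                ["dedup_reference.pdf", "grounding_source.pdf", "qa_checklist.pdf"]]
client.vector_stores.file_batches.upload_and_poll(
    vector_store_id=store.id, files=file_streams
)

# Prompt that assigns each file exactly one role and asks the model to
# self-report which file it relied on at each step.
instructions = """You can search three documents:
- dedup_reference.pdf: use ONLY to check that the new content is not a duplicate.
- grounding_source.pdf: use ONLY as the source material for the new content.
- qa_checklist.pdf: use ONLY to validate the draft before returning it.
After each step, state which file (filename and file_id) you relied on."""

response = client.responses.create(
    model="gpt-5",
    instructions=instructions,
    input="Draft the new section on <topic>.",  # placeholder task
    tools=[{
        "type": "file_search",
        "vector_store_ids": [store.id],
        # Optional retrieval controls mentioned above:
        # "max_num_results": 10,
        # "ranking_options": {"score_threshold": 0.5},
    }],
    include=["file_search_call.results"],  # return retrieved chunks with their file_ids
)

# Post-validation: which files were actually retrieved and cited?
retrieved_file_ids, cited_file_ids = set(), set()
for item in response.output:
    if item.type == "file_search_call":
        for result in getattr(item, "results", None) or []:
            retrieved_file_ids.add(result.file_id)
    elif item.type == "message":
        for part in item.content:
            for ann in getattr(part, "annotations", None) or []:
                if ann.type == "file_citation":
                    cited_file_ids.add(ann.file_id)

print("retrieved:", retrieved_file_ids)
print("cited:", cited_file_ids)
# Compare these sets against the file_id you expect for each step and
# treat any mismatch as a violation (e.g. retry or flag for review).
```

In n8n I’d presumably reproduce the same request body in an HTTP Request node and do the validation in a Code node, but the structure would be the same.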


My question to the community

  • Is this the correct / recommended way to handle a multi-document scenario in a single Responses API call?
  • Is there any downside to combining multiple files into one vector store versus using multiple vector stores (one per file, roughly as sketched after this list)?
  • Are there official limits on how many vector stores can be attached to a single call (I’ve seen conflicting info about “up to two” vs. vector_store_ids as an array)?
  • Is there a more reliable way to enforce “Doc A only for dedupe, Doc B only for grounding, Doc C only for QA” within one call?
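
For reference, the multi-store variant I mean in the second and third bullets would, as far as I can tell, just be passing several IDs in vector_store_ids. The store IDs below are hypothetical, and whether more than one ID is actually accepted in a single call is exactly what I’m unsure about:

```python
# Hypothetical: one vector store per document, all attached to the same call.
tools = [{
    "type": "file_search",
    "vector_store_ids": [dedup_store_id, grounding_store_id, qa_store_id],
}]

response = client.responses.create(
    model="gpt-5",
    instructions=instructions,  # same role-per-file prompt as above
    input="Draft the new section on <topic>.",
    tools=tools,
    include=["file_search_call.results"],
)
```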

Would really appreciate input from those of you who’ve worked directly with File Search and the Responses API:
Is ChatGPT’s proposed solution correct, and how would you approach this in practice?

Thanks!