Responses API file_search intermittently returns zero results for the same prompt

I’m building a RAG assistant using the Responses API with the file_search tool and an existing vector store.

I’m seeing intermittent false-negative retrieval. The same prompt, sent in separate new chats, sometimes retrieves the expected files and answers correctly, and sometimes file_search is invoked but returns zero results. There are no API/platform errors. The response completes normally.

Setup:

const tools: any[] = tenantData.vector_store_id
  ? [
      {
        type: "file_search",
        vector_store_ids: [tenantData.vector_store_id],
        max_num_results: 10,
      },
    ]
  : [];

const response = await openai.responses.create(
  {
    model: "gpt-5-nano",
    instructions,
    input: message,
    tools,
    previous_response_id: (conv.last_response_id as any) ?? undefined,
    stream: true,
    include: ["file_search_call.results"], 
  },
  { signal: request.signal },
);

Example prompt:

Summarize the work experience of the individual in the documents.

Failed run:

{
  "type": "file_search_call",
  "status": "completed",
  "resultCount": 0,
  "queries": [
    "Summarize the work experience of the individual in the documents.",
    "work experience of the individual in the documents",
    "individual's work history in the uploaded documents",
    "CV or resume in uploaded documents",
    "professional experience of the individual named in the documents"
  ],
  "resultFiles": []
}

Successful run with the same prompt, same vector store, separate new chat:

{
  "type": "file_search_call",
  "status": "completed",
  "resultCount": 10,
  "queries": [
    "Summarize the work experience of the individual in the documents.",
    "What is the individual's work experience mentioned in the uploaded documents?",
    "CV or resume included in the documents: summarize work history.",
    "Work history of the person described in the documents.",
    "Profile or professional experience of the individual in the files."
  ],
  "resultFiles": [
    { "filename": "<resume document>", "score": 0.9306 },
    { "filename": "<resume document>", "score": 0.925 },
    { "filename": "<project report>", "score": 0.8229 },
    { "filename": "<project report>", "score": 0.8015 }
  ]
}

This does not look like the model choosing not to use the tool. The tool is called in both cases. The issue is that one completed file search returns no files, while another completed file search for the same prompt/vector store returns the expected files.

An identical prompt sent only seconds apart will either fail to yield file search results or return the correct response, seemingly at random.

This behaviour only started in the last week or so; nothing in my codebase or vector store has significantly changed, and reverting to a month-old version of the code does not resolve the issue.

Thank you for any help.

Update — reproduced with direct vector store search and request IDs

I did further testing to isolate this from model/tool orchestration.

I wrote a standalone Python script that calls direct vector store search in a loop using client.vector_stores.with_raw_response.search(). This does not use the Responses API, does not use a model, does not use streaming, does not use previous_response_id, and does not involve my application code.

Each search is an independent API request. The script cycles through 3 fixed static queries against the same completed vector store.

Results:

  • Total searches: 50
  • Zero-result searches: 12
  • Success rate: 76.0%
  • Same vector store
  • Same files
  • Same fixed queries
  • 2s delay between calls
  • No thrown API errors

This strongly suggests the issue is happening at the direct vector store search layer, not only in model-managed file_search.

Latency pattern:

  • Every failed search returned in under 1.1s
  • Successful searches usually took 3-7s
  • The failed calls completed normally, but returned data: []
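The split above was tallied from the `(result_count, duration)` pairs the test script prints. A small helper like this (hypothetical name, just aggregation over the logged pairs) makes the pattern easy to compute for anyone rerunning the loop:

```python
def summarize_runs(runs):
    """Aggregate (result_count, duration_s) pairs from the search loop.

    Returns totals, the success rate, and the slowest failed / fastest
    successful durations, mirroring the stats reported above.
    """
    failed = [d for n, d in runs if n == 0]
    ok = [d for n, d in runs if n > 0]
    total = len(runs)
    return {
        "total": total,
        "zero_results": len(failed),
        "success_rate": round(100 * len(ok) / total, 1) if total else 0.0,
        "max_failed_duration": max(failed) if failed else None,
        "min_ok_duration": min(ok) if ok else None,
    }
```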

Failed request IDs:

req_eef165f83df642c58fbc57221a919be4 duration=0.92s
req_d39e20cd27e14d7d98efd05732cf9c99 duration=0.59s
req_b64dc6f03f5544d787e4218ba3b05a61 duration=0.58s
req_1ac8cdfd4dcc4abf8eca82090d5ad446 duration=0.80s
req_c152fe526cbd4f06aebf7f15383569fc duration=0.61s
req_14228e4bd4e74c1caf2649f91dddf845 duration=0.53s
req_56d118c305fe448bbf025674da21592a duration=0.49s
req_089480f7c932927c8995246d82358128 duration=1.06s
req_2ced8d46f16340fb8a45dc7b5e023e72 duration=0.53s
req_0b48d9877241446a96786e3133ae851f duration=0.48s
req_a17b30f9c9a54c9b913abb3ab4fe70f8 duration=0.52s
req_cd9fe65c4f6c4f948905d69b55652dc3 duration=0.50s

Successful request IDs for comparison:

req_e79b1fd743ad479ea08892d5e8821921 duration=6.69s
req_7fec70da0624438aa59a7f7d68c767f0 duration=5.20s
req_4d73bbfdc3ee404d89e91d8b03776977 duration=4.28s
req_6871b0fa18d742b0a770b9624e2a493a duration=4.20s
req_8f54b2112ffb44cc9a3868d08ae9142d duration=5.74s

Minimal version of the test script:

import os
import time
from openai import OpenAI

VECTOR_STORE_ID = "your-vector-store-id"
QUERIES = [
    "query one",
    "query two",
    "query three",
]
RUNS = 50

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

for i in range(1, RUNS + 1):
    query = QUERIES[(i - 1) % len(QUERIES)]

    started = time.time()
    raw = client.vector_stores.with_raw_response.search(
        vector_store_id=VECTOR_STORE_ID,
        query=query,
        max_num_results=10,
    )

    duration = round(time.time() - started, 2)
    data = raw.parse().model_dump()
    results = data.get("data", [])
    request_id = raw.headers.get("x-request-id")

    print(
        f"{i:02d}/{RUNS} results={len(results)} "
        f"duration={duration}s request_id={request_id}"
    )

    time.sleep(2)

Context:

  • Vector store status: completed

  • File count and contents unchanged during the test

  • Reproduces with direct vector store search

  • Also observed through Responses API hosted file_search

  • Approximate zero-result rate: 24-33% per individual search call

  • Issue started approximately 1-2 weeks ago with no application/vector store changes
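Not a fix, but a stopgap I'm considering while this is investigated: since every failed call returned empty in about a second while genuine searches took several, a retry wrapper keyed on that fast-empty signature should paper over most of the misses. This is a hedged sketch only; the 1.5 s threshold and retry count are assumptions based on the timings above, not documented behaviour:

```python
import time

def search_with_retry(do_search, retries=3, fast_empty_s=1.5, backoff_s=1.0):
    """Retry a vector store search that returns zero results suspiciously fast.

    `do_search` is any zero-argument callable returning a list of results,
    e.g. a lambda wrapping client.vector_stores.search(...). An empty result
    set that came back faster than `fast_empty_s` seconds is treated as a
    transient miss and retried; a slow empty result is trusted as a genuine
    "no matches" answer.
    """
    results = []
    for attempt in range(retries + 1):
        started = time.time()
        results = do_search()
        duration = time.time() - started
        if results or duration >= fast_empty_s:
            return results
        if attempt < retries:
            time.sleep(backoff_s)
    return results
```

Obviously this just burns extra requests and latency on the bad calls, so I'd still like to understand the root cause.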