Bug: Vector store status: completed does not guarantee searchability - file_search returns empty results silently

morritse · February 17, 2026, 7:54pm

Summary

After creating a vector store with vectorStores.create({ file_ids: […] }) and polling until status === "completed", file_search tool calls made immediately against the store return empty or sparse results. There is no error, no warning, and no way to distinguish “index not ready” from “document doesn’t contain this content.” The index becomes fully queryable only after an indeterminate delay (5+ seconds post-completion).

Environment

API: Responses API (responses.create) with tools: [{ type: "file_search" }]
Model: gpt-5.2
Document: ~2.1M character PDF (1,200+ pages), preprocessed into raw text with markers, uploaded via files.create then attached to vector store
Reasoning effort: tested at none/low/medium (I initially thought it was a reasoning issue, but higher reasoning effort only led to better results because responses took longer, giving the vector store more time to become searchable)

Reproduction steps

1. Upload a large PDF via files.create({ purpose: "assistants" })
2. Create a vector store: vectorStores.create({ name: "…", file_ids: [fileId] })
3. Poll vectorStores.retrieve(storeId) until status === "completed"
4. Immediately start making responses.create calls with tools: [{ type: "file_search", vector_store_ids: [storeId] }]
5. Observe that early calls return responses with no retrieved content, while later calls (same store, same document, same prompt structure) return rich results
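
For anyone who wants to check this on their own account, the steps above can be sketched roughly as follows. This is a minimal sketch, not the code from my pipeline: the ReproClient interface is a hypothetical, trimmed-down shape that only mirrors the calls named in the steps, the poll interval is arbitrary, and a real SDK client may need a thin adapter to satisfy these types.

```typescript
// Minimal structural types: only the fields this sketch reads.
// They mirror the calls named in the steps but are assumptions,
// not the SDK's full type definitions.
interface VectorStore { id: string; status: string; }
interface SearchResponse { usage?: { input_tokens: number }; }
interface ReproClient {
  vectorStores: {
    create(params: { name: string; file_ids: string[] }): Promise<VectorStore>;
    retrieve(id: string): Promise<VectorStore>;
  };
  responses: {
    create(params: {
      model: string;
      input: string;
      tools: { type: "file_search"; vector_store_ids: string[] }[];
    }): Promise<SearchResponse>;
  };
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Creates a store, polls until status is "completed", then queries
// immediately and returns the input token count of each sequential call.
async function reproduce(
  client: ReproClient,
  fileId: string,
  prompts: string[],
  pollMs = 1000, // hypothetical poll interval
): Promise<number[]> {
  let store = await client.vectorStores.create({ name: "repro", file_ids: [fileId] });
  while (store.status !== "completed") {
    // Stop on any status other than "in_progress" instead of polling forever.
    if (store.status !== "in_progress") {
      throw new Error(`unexpected vector store status: ${store.status}`);
    }
    await sleep(pollMs);
    store = await client.vectorStores.retrieve(store.id);
  }

  const tokenCounts: number[] = [];
  for (const input of prompts) {
    const response = await client.responses.create({
      model: "gpt-5.2",
      input,
      tools: [{ type: "file_search", vector_store_ids: [store.id] }],
    });
    // A missing usage block is recorded as 0 rather than guessed.
    tokenCounts.push(response.usage?.input_tokens ?? 0);
  }
  return tokenCounts;
}
```

If the bug is present, the first few entries of the returned array should sit near the prompt-only baseline and later entries should be much larger.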

Evidence

I ran 8 generation passes over the same 11 comments (this pipeline answers people’s comments against a source document), all against the same document. Each run creates a new vector store, polls to completion, then processes comments in batches of 5.

Input token counts directly measure how much content file_search returned (higher = more retrieved chunks). The system prompt + comment text alone is ~9-10k tokens.
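
As a rough illustration of that heuristic, a check along these lines could flag calls where retrieval probably returned nothing. The function name and the 25% slack are assumptions made for this sketch; the baseline would have to be measured for your own prompt.

```typescript
// Flags a response whose input token count is close to the prompt-only
// baseline, which suggests file_search contributed nothing.
// `slack` is a hypothetical tolerance for variation in prompt size.
function retrievalLooksEmpty(
  inputTokens: number,
  baselineTokens: number,
  slack = 0.25,
): boolean {
  return inputTokens <= baselineTokens * (1 + slack);
}

// Token counts taken from this post, with a 10,000-token baseline.
console.log(retrievalLooksEmpty(10_902, 10_000)); // true
console.log(retrievalLooksEmpty(77_916, 10_000)); // false
```

This only detects a fully empty retrieval. It says nothing about sparse results, where some chunks come back but not all.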

All values are input tokens per call.

| Run | Reasoning | Batch 1 comment 1 | Batch 1 comment 2 | Batch 1 comment 5 | Batch 2 (all) |
|---|---|---|---|---|---|
| 1 (pre-existing store attached; expected behavior) | low | 77,916 | 76,582 | 93,939 | 25-43k |
| 2 (bug present in runs 2-7) | low | 10,902 | 9,535 | 110,223 | 25-43k |
| 3 | low | 9,578 | 10,962 | 72,879 | 17-53k |
| 4 | none | 9,448 | 10,298 | 9,722 | 17-26k |
| 6 | none | 9,471 | 9,585 | 10,289 | 25-26k |
| 7 | none | 10,911 | 9,630 | 10,289 | 25-26k |

Key observations:

~9-10k input = zero file_search content returned. The model receives only the system prompt and comment text. It searches the store, gets nothing back, and responds with “The administrative record does not contain…”, which is factually wrong; the content is in the store, it’s just not searchable yet.
Run 2 batch 1 shows the index coming online in real time: 10k → 10k → 60k → 27k → 110k across 5 sequentially-processed comments in the same batch.
Run 1 had a second, pre-existing vector store (created in a prior session) attached alongside the new one. That run had no retrieval failures because the pre-indexed store provided content immediately.
Batch 2 is always fine (17-53k input across runs, well above the ~9-10k baseline) because by the time batch 1 finishes processing (~60-90 seconds), the index is “fully warm”.

Impact

Silent data loss. The model produces confident but content-free responses. There is no error or signal that retrieval failed; the file_search tool simply returns no results. The consumer cannot distinguish “index not ready” from “content not found.”
Non-deterministic output quality. Identical inputs produce dramatically different outputs depending on timing relative to index creation. This is invisible without inspecting token counts, and left me scratching my head for quite a while.
No workaround signal. There’s no file_counts.indexed field, no search-readiness endpoint, and no error on the file_search tool call. The only way to detect this is to monitor input token counts or do a test query, neither of which the API is designed to support.

Expected behavior

Either:

status: "completed" should mean the index is queryable, i.e. don’t report completion until search works, OR
Add a distinct status like "search_ready" that indicates queryability, OR
Have the file_search tool return an error/warning when querying an index that isn’t fully propagated (e.g., "status": "index_warming" in the tool result), so consumers can retry

Current workaround

I added a post-poll delay and a test probe query loop on my side, which has worked so far, but this is guesswork since there’s no API signal for when the index is actually ready.
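
One possible shape for such a delay-plus-probe loop is sketched below. It is a simplified illustration, not my exact code: waitUntilSearchable, ReadinessOptions, and the probe callback are hypothetical names, and the probe itself is left to the caller (for example, a file_search query for a phrase known to be in the document, returning true when input tokens rise clearly above the prompt-only baseline).

```typescript
interface ReadinessOptions {
  initialDelayMs: number; // fixed wait after status === "completed"
  retryDelayMs: number;   // wait between failed probes
  maxAttempts: number;    // give up after this many probes
}

const pause = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Waits, then calls `probe` until it reports retrieved content or the
// attempts run out. Returns true if a probe succeeded, false otherwise.
async function waitUntilSearchable(
  probe: () => Promise<boolean>,
  { initialDelayMs, retryDelayMs, maxAttempts }: ReadinessOptions,
): Promise<boolean> {
  await pause(initialDelayMs);
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (await probe()) return true;
    // No need to wait after the final failed attempt.
    if (attempt < maxAttempts) await pause(retryDelayMs);
  }
  return false;
}
```

Returning false rather than throwing lets the caller decide whether to abort the run or continue with a warning. Note that one successful probe does not prove the index is fully warm: in run 2 the counts went 60k → 27k → 110k after the first hit, so requiring several consecutive successful probes may be safer.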
