The official documentation (Assistants API > Tools > File Search > Managing costs with expiration policies) states:
“Your first GB is free and beyond that, usage is billed at $0.10/GB/day of vector storage. There are no other costs associated with vector store operations.”
However, I’ve seen some users in this forum report being billed for “file search tool calls” even when using the direct Vector Store Search API (not through the Responses API).
My question:
Is there a per-call/per-query cost when using client.vector_stores.search()?
If yes, what is the rate? Is it the same as the Responses API file_search tool ($2.50 per 1,000 calls)?
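For context, this is the kind of direct call I mean. The sketch below builds (but does not send) the raw HTTP request that `client.vector_stores.search()` corresponds to, using only the standard library; the vector store ID and query are made-up placeholders:

```python
import json
import urllib.request

# Hypothetical vector store ID, for illustration only.
VECTOR_STORE_ID = "vs_abc123"

# The SDK call in question, client.vector_stores.search(), hits this endpoint:
# POST https://api.openai.com/v1/vector_stores/{vector_store_id}/search
url = f"https://api.openai.com/v1/vector_stores/{VECTOR_STORE_ID}/search"

payload = {
    "query": "expiration policy for vector stores",  # free-text query
    "max_num_results": 5,                            # optional cap on results
}

# Build, but do not send, the request -- each POST to this endpoint is
# what would (or would not) be billed as a "file search tool call".
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer $OPENAI_API_KEY",   # placeholder, not a real key
        "Content-Type": "application/json",
    },
    method="POST",
)

print(req.method, req.full_url)
```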
Same per-use rate as the Responses API. When you call the endpoint yourself, you incur a single per-call charge. If you let a model use the file_search tool instead, it may decide to make multiple “uses” on your behalf.
The big cost of a self-hosted, embeddings-based vector store is normally the up-front cost of ingesting files; after that, individual embeddings query calls are very cheap. OpenAI inverts this to lower the barrier to entry: vector store creation and ingestion are free, and they recover their costs through daily storage fees, a per-call fee, and the large block of context placed into the model when it is used as a tool.
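Putting the published figures together (the $0.10/GB/day storage rate quoted from the docs above and the $2.50 per 1,000 calls rate mentioned in the question), a rough estimate looks like this. This is a back-of-the-envelope sketch, not an official calculator, and it deliberately excludes model token costs:

```python
STORAGE_RATE_PER_GB_DAY = 0.10      # beyond the first free GB, per the docs quoted above
FREE_STORAGE_GB = 1.0
SEARCH_RATE_PER_CALL = 2.50 / 1000  # $2.50 per 1,000 file_search calls

def estimate_monthly_cost(storage_gb: float, calls: int, days: int = 30) -> float:
    """Rough monthly cost: daily storage beyond the free GB, plus per-call search fees.

    Token costs from placing search results into a model's context are NOT
    included; those depend on the model and on result size.
    """
    billable_gb = max(0.0, storage_gb - FREE_STORAGE_GB)
    storage_cost = billable_gb * STORAGE_RATE_PER_GB_DAY * days
    search_cost = calls * SEARCH_RATE_PER_CALL
    return round(storage_cost + search_cost, 2)

# Example: 3 GB stored for 30 days plus 10,000 direct search calls
print(estimate_monthly_cost(3.0, 10_000))  # 2 GB * $0.10 * 30 + 10,000 * $0.0025 = 31.0
```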
The Assistants API didn’t have a usage fee for file search, but of course you had no direct access to the results and couldn’t write your own query. It’s being shut off next year.
I haven’t been able to confirm this on my end yet. When I tested client.vector_stores.search() myself, I couldn’t find where these calls are reflected in my API usage dashboard.
I checked:
Usage > File Searches — shows 1 request, but this was from a Responses API file_search call, not from the direct client.vector_stores.search() call
The issue was that I was filtering by API key in the usage dashboard. Once I cleared the filter, the calls showed up correctly. Oddly, filtering by API key shows only Responses API file_search calls, not direct client.vector_stores.search() calls.