Vector store search as its own endpoint is missing from the pricing and billing information. One has to extrapolate that calling the endpoint directly costs the same as when the Responses API invokes search internally.
It makes sense that a single call's cost would be rounded to a display amount, although the old usage page previously presented smaller values.
We can discover the underlying cost by making a more significant series of calls…
Write a function:
```python
import os

import httpx


def search_vs(vector_store_id: str, query: str = "a placeholder text", max_results: int = 10) -> dict:
    """Use the vector store search endpoint for a query; get up to max_results results."""
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise EnvironmentError("Search: OPENAI_API_KEY not set")
    url = f"https://api.openai.com/v1/vector_stores/{vector_store_id}/search"
    headers = {"Authorization": f"Bearer {api_key}"}
    body = {
        "query": query,
        "max_num_results": max_results,
    }
    with httpx.Client(timeout=20.0) as client:
        response = client.post(url, headers=headers, json=body)
        response.raise_for_status()
        return response.json()
```
Then call the “search” endpoint a few times… how about 100×:

```python
for _ in range(100):
    response = search_vs(id)  # id: your vector store ID
    print(response)
```
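A quick sanity check on the token math, using the post's own numbers (the 3-tokens-per-call figure is an assumption derived from the 300-token total over 100 calls):

```python
# Back-of-the-envelope: the only extra AI usage is embedding the query text.
query = "a placeholder text"
calls = 100
tokens_per_call = 3  # assumed: 300 total cl100k tokens / 100 calls
total_tokens = calls * tokens_per_call
print(total_tokens)  # 300
```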
For what is ultimately running 300 tokens of cl100k embeddings behind the scenes as the only additional AI, 100 calls thus aligns with the per-1000-calls pricing:
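To turn a call count into a dollar figure, a minimal sketch (the $2.50-per-1000-calls rate here is an assumption for illustration; substitute whatever rate the pricing page actually lists):

```python
def search_cost(calls: int, price_per_1000_calls: float) -> float:
    """Cost of vector store search when billed per 1000 calls."""
    return calls / 1000 * price_per_1000_calls

# 100 calls at an assumed $2.50 per 1000 calls
print(f"${search_cost(100, 2.50):.2f}")  # $0.25
```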