One can imagine that the scenario here is a database propagation delay before the URL path for retrieving the object by vector store file ID becomes ready. The request hits:
GET https://api.openai.com/v1/vector_stores/{vector_store_id}/files/{file_id}
A 404 status should be treated as an expected part of the lifecycle, and immediate polling should be recognized as futile.
Quite frankly, the SDK polling method sits at the wrong level of the code. Taking only a file ID and a vector store ID, it knows nothing about the size of the upload or the expected latency of document extraction and chunk embedding, nor does it accept parameters that would let the caller pass in that knowledge.
Giving OpenAI the benefit of the doubt, they don’t describe using their simple polling function in the file search documentation. The steps given for file search are upload, add/attach, and check. However, listing all files with .vector_stores.files.list() as the “check” for a single file attachment is also a poor idea, because the list method cannot return anything over 10000 files, nor does it have a limit, and polling a huge list per file would be silly.
However, the “retrieval” (semantic search endpoint) documentation does offer .vector_stores.files.upload_and_poll() for the same slot, which has now also proven poor. The name is also terrible, as the method doesn’t upload anything over the network.
The method that should be used is .vector_stores.files.retrieve() (.vectorStores.files.retrieve() in the Node SDK), which maps to the GET path shown earlier.
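A minimal sketch of calling that retrieve endpoint directly and treating 404 as "not there yet" rather than as a failure. This is not SDK code: `build_request` and `retrieve_file` are hypothetical helper names, the `opener` parameter exists only so the call can be swapped out in tests, and the `OpenAI-Beta` header is an assumption about what the endpoint expects.

```python
import json
import urllib.error
import urllib.request

API_BASE = "https://api.openai.com/v1"


def build_request(vector_store_id, file_id, api_key):
    """Build the GET request for a single vector store file object."""
    url = f"{API_BASE}/vector_stores/{vector_store_id}/files/{file_id}"
    headers = {
        "Authorization": f"Bearer {api_key}",
        # Assumption: the beta header used with vector store endpoints.
        "OpenAI-Beta": "assistants=v2",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def retrieve_file(vector_store_id, file_id, api_key,
                  opener=urllib.request.urlopen):
    """Return the parsed file object, or None while the API answers 404."""
    request = build_request(vector_store_id, file_id, api_key)
    try:
        with opener(request, timeout=30) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # not propagated yet; the caller decides when to retry
        raise  # auth errors, rate limits, etc. are real failures
```

Returning `None` instead of raising keeps the "not ready" case out of the exception path, so the retry policy can live in the application rather than in the HTTP helper.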
The polling deserves application-level consideration:
- tolerate an initial 404
- wait a minimum delay based on file size before the first check
- tune the delay by file type (a PDF is more complex to process, but may yield fewer extracted tokens)
- poll aggressively during the expected completion window, for user experience
- back off after the expectation is not met
- have a failure-state timeout, also extrapolated from the file type and size.
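The considerations above can be sketched roughly as follows. Everything here is illustrative: the timing constants are invented placeholders to be replaced with your own measurements, `make_plan` and `poll_until_ready` are hypothetical names, and `fetch` is assumed to be any callable that returns the parsed vector store file object, or `None` when the API answers 404.

```python
import time
from dataclasses import dataclass

# Invented tuning numbers (assumptions): replace with measured values.
BASE_SECONDS = 2.0
SECONDS_PER_MB = {"pdf": 3.0, "default": 1.0}
TERMINAL_STATUSES = ("completed", "failed", "cancelled")


@dataclass
class PollPlan:
    min_delay: float  # wait this long before the first request
    expected: float   # processing should normally be done by now
    timeout: float    # give up after this many seconds


def make_plan(size_bytes, file_type):
    """Extrapolate timing expectations from file size and type."""
    size_mb = size_bytes / (1024 * 1024)
    rate = SECONDS_PER_MB.get(file_type, SECONDS_PER_MB["default"])
    expected = BASE_SECONDS + size_mb * rate
    return PollPlan(min_delay=expected * 0.25,
                    expected=expected,
                    timeout=expected * 10 + 30)


def poll_until_ready(fetch, plan, sleep=time.sleep, clock=time.monotonic):
    """Poll fetch() until a terminal status; a None (404) result is tolerated."""
    start = clock()
    sleep(plan.min_delay)  # immediate polling is futile
    interval = 0.5         # aggressive while inside the expected window
    while True:
        elapsed = clock() - start
        if elapsed > plan.timeout:
            raise TimeoutError(f"file not ready after {elapsed:.1f}s")
        obj = fetch()
        if obj is not None and obj.get("status") in TERMINAL_STATUSES:
            return obj
        if elapsed > plan.expected:
            interval = min(interval * 2, 15.0)  # back off once overdue
        sleep(interval)
```

The caller still has to inspect the returned `status`, since `failed` and `cancelled` end the loop just as `completed` does. Passing `sleep` and `clock` as parameters is only a convenience for testing the schedule without real waiting.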
For this post, with this issue now going on a week, I whipped up a mod of my “real uploader” to tolerate an initial 404. You can port it and add some brains:
Better would be:
- No OpenAI SDK bloat for simple API methods
- No OpenAI vector stores if you don’t want a pattern of downtimes
- No file search tool if you don’t want injections saying “user uploaded files”
- No Responses API