Wrong Doc for Create vector store file batch

https://platform.openai.com/docs/api-reference/vector-stores-file-batches/createBatch?lang=python

This shows a request example like:

from openai import OpenAI
client = OpenAI()

vector_store_file_batch = client.vector_stores.file_batches.create(
    vector_store_id="vs_abc123",
    files=[
        {
            "file_id": "file-abc123",
            "attributes": {"category": "finance"},
        },
        {
            "file_id": "file-abc456",
            "chunking_strategy": {
                "type": "static",
                "max_chunk_size_tokens": 1200,
                "chunk_overlap_tokens": 200,
            },
        },
    ],
)
print(vector_store_file_batch)

But the latest SDK does not allow specifying attributes per file as in the example above. Is this documentation for an upcoming version, or is the doc obsolete or wrong?



Here is the path where file batches-> create is available in the Python SDK:

Tracing the commits back to when this was moved out of beta (a rename that broke all Assistants code), the “files” parameter was not there then, and it has never been there since.

In fact, I find that it is a new API addition: my copy of the OpenAPI specification, retrieved on Oct 7, only has “file_ids”, and there it is required rather than optional, so it leaves no room for the alternate input.

Here is the Python example from the API reference before that unsupported revision:

from openai import OpenAI
client = OpenAI()

vector_store_file_batch = client.vector_stores.file_batches.create(
    vector_store_id="vs_abc123",
    file_ids=["file-abc123", "file-abc456"]
)
print(vector_store_file_batch)

The benefit of the “files” array seems to be per-file attributes metadata and chunking strategy, where one might even add the file name as a key-value instead of all files being the same.

So I guess your options are: wait for the SDK to be patched, patch it yourself, write the request yourself to see whether the parameter is actually active on the API (I’m not invested enough to check), or pass it with “extra_body” (working around “file_ids” still being required by the SDK).
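For the "write it yourself" route, a minimal sketch that bypasses the SDK and builds the raw request against the endpoint from the API reference could look like the following. The helper names `build_file_batch_request` and `send` are made up for illustration, the `OpenAI-Beta` header is an assumption carried over from older vector store examples, and whether the API actually accepts “files” is exactly what this would test, not something I have verified.

```python
import json
import urllib.request

API_BASE = "https://api.openai.com/v1"


def build_file_batch_request(vector_store_id, files, api_key):
    """Build (but do not send) a POST to the file_batches endpoint.

    `files` is the per-file list from the documentation example, e.g.
    [{"file_id": "...", "attributes": {...}}, ...]. Hypothetical helper.
    """
    body = json.dumps({"files": files}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/vector_stores/{vector_store_id}/file_batches",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # Assumption: beta header used by older vector store examples;
            # drop it if the API no longer needs it.
            "OpenAI-Beta": "assistants=v2",
        },
    )


def send(request):
    """Send the request and decode the JSON response.

    urllib raises urllib.error.HTTPError on a 4xx/5xx status, which is how
    a rejected "files" parameter would show up.
    """
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

To try it, build the request with your own vector store ID, the “files” list from the doc example, and the key from your `OPENAI_API_KEY` environment variable, then pass it to `send`. A 400 response naming “files” as unknown would suggest the documentation is ahead of the API.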

I’ve also found that the batches method cannot be relied on to work unattended: the API has a high rate of file failures and doesn’t produce an exception report, and even the 10000-file-max listing isn’t a reliable guarantee currently. The feature simply generates new problem-report forum topics here. So writing your own higher-level code seems appropriate anyway.
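As a rough idea of what that higher-level code might do, here is a sketch that reconciles the file IDs you submitted against the per-file records you get back from listing the vector store's files. It assumes each record is a dict with "id", "status" (one of "in_progress", "completed", "failed", "cancelled") and an optional "last_error" object; `BatchReport` and `reconcile` are hypothetical names, and fetching and paginating the listing is left out.

```python
from dataclasses import dataclass, field


@dataclass
class BatchReport:
    completed: list = field(default_factory=list)  # file IDs that finished
    failed: dict = field(default_factory=dict)     # file ID -> error message
    pending: list = field(default_factory=list)    # still in progress
    missing: list = field(default_factory=list)    # submitted but never listed


def reconcile(requested_ids, listed_files):
    """Compare submitted file IDs with the records the listing returned."""
    by_id = {record["id"]: record for record in listed_files}
    report = BatchReport()
    for file_id in requested_ids:
        record = by_id.get(file_id)
        if record is None:
            # The listing did not include this file at all.
            report.missing.append(file_id)
        elif record["status"] == "completed":
            report.completed.append(file_id)
        elif record["status"] in ("failed", "cancelled"):
            error = record.get("last_error") or {}
            report.failed[file_id] = error.get("message", record["status"])
        else:
            # Anything else (e.g. "in_progress") is treated as not done yet.
            report.pending.append(file_id)
    return report
```

Files that end up in `failed` or `missing` are the candidates for a retry; the report gives you the exception summary the batch endpoint itself doesn't.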

POST https://api.openai.com/v1/vector_stores/{vector_store_id}/file_batches


Thanks for the feedback. Given that I uncovered another major issue that is a deal breaker (deleted files in a vector store still show with an “in_progress” status in the file list endpoint) and that the Agents SDK file search only supports OpenAI vector stores, I am now leaning toward migrating all our dev efforts away from OpenAI. We’ve been a heavy-usage Tier 5 customer for a long time, but I am losing all confidence in what sounded promising (Agents SDK + ChatKit + Vector Store); it looks like an afterthought that will quickly become vaporware. I think OpenAI’s focus is on the masses and not really on developers. Very disappointing. :frowning: