So I created my first vector store through the UI and uploaded an 11 MB JSON file to it. The upload happens, but soon after I see an error: "Failed to upload file." I validated the JSON and it is valid. The vector store size shows 43 MB, so it does look like the upload happened but something went wrong afterwards.
Same issue. I have tried various formats.
Same issue with Vector store
You can transform the JSON into what would be recognized as a text file, with some more plain text header and footer, which can also be some metadata about the file to make it more useful. That should allow it to be classified as plain text by the inspector.
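A minimal sketch of that idea in Python (the file names and the header/footer text here are just placeholders):

```python
# Wrap the JSON payload in a plain-text header and footer so the file-type
# inspector classifies it as plain text instead of JSON.
with open("data.json", "r", encoding="utf-8") as f:
    payload = f.read()

header = "Export of product metadata, plain-text wrapper.\n\n"  # any descriptive metadata works
footer = "\n\nEnd of export.\n"

with open("data.txt", "w", encoding="utf-8") as f:
    f.write(header + payload + footer)
```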
Same issue here. Had a client give me a JSON file and tried to put it into a vector store but it errored out. I followed the advice from @_j and it took it.
This doesn’t fit with the list of supported files in the documentation for vector stores. Hope it will be addressed.
I’ve had a possibly related error: the API didn’t like some of the key names in my JSON (sorry, I don’t remember which ones specifically). Try looking for anything generic like “data” and renaming it to see if that helps.
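If you want to test that quickly, a rough sketch (the "data" → "records" rename and the file names are just examples):

```python
import json

# Rename a generic top-level key ("data" -> "records" here is only an example)
# and write the result to a new file to test whether the upload succeeds.
with open("input.json", "r", encoding="utf-8") as f:
    doc = json.load(f)

if "data" in doc:
    doc["records"] = doc.pop("data")

with open("renamed.json", "w", encoding="utf-8") as f:
    json.dump(doc, f)
```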
The issue with my JSON was array items defined on separate lines.
So, the following caused errors:
```json
"metadata": [
  "kw1",
  "kw2",
  "kw3",
  "kw4"
]
```
While the following was accepted:
```json
"metadata": ["kw1", "kw2", "kw3", "kw4"]
```
That seems like a bug, since the formatting shouldn’t matter as long as the JSON is well formed.
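If anyone wants to collapse that formatting before uploading, a quick sketch (assuming the file is a single well-formed JSON document):

```python
import json

# Re-serialize without indentation so array items end up on a single line.
with open("input.json", "r", encoding="utf-8") as f:
    doc = json.load(f)

with open("minified.json", "w", encoding="utf-8") as f:
    json.dump(doc, f, separators=(",", ":"))
```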
I had some success by splitting the files into chunks. I tried to create a vector store from 120,000 products in a single JSON file.
That failed most of the time. I briefly had some success by removing the formatting from the JSON, but I think I just hit some threshold.
Now I am uploading multiple product JSON files with 5,000 products in each. This seems to work reliably so far.
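Roughly the approach, as a sketch (it assumes the source file is one big JSON array of products, and the vector store attach call may live under a slightly different path depending on your SDK version):

```python
import json
from openai import OpenAI

client = OpenAI()
CHUNK_SIZE = 5000
VECTOR_STORE_ID = "vs_..."  # replace with your vector store id

# Assumed input: one big JSON array of product objects.
with open("products.json", "r", encoding="utf-8") as f:
    products = json.load(f)

for i in range(0, len(products), CHUNK_SIZE):
    chunk_path = f"products_{i // CHUNK_SIZE:04d}.json"
    with open(chunk_path, "w", encoding="utf-8") as out:
        json.dump(products[i:i + CHUNK_SIZE], out)

    # Upload the chunk, then attach it to the vector store.
    # Newer SDK versions expose this without the ".beta" prefix.
    with open(chunk_path, "rb") as fh:
        uploaded = client.files.create(file=fh, purpose="assistants")
    client.beta.vector_stores.files.create(
        vector_store_id=VECTOR_STORE_ID,
        file_id=uploaded.id,
    )
```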
I’m also trying to upload the entire Wikipedia abstracts dump to OpenAI Vector Stores.
Uploading all of it failed, even with no new lines, text format, etc.
What worked for me was splitting it into chunks.
I think the entire dump (after my post-processing) was around 600 MB, which the API wouldn’t accept. Turns out I can upload chunks of ~92 MB each in text format without an issue.
Very rough estimates, didn’t do any further checking:
500 MB might be too much for a single upload, but you can get away with chunks of under ~100 MB.
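For reference, the kind of split I did, as a rough sketch (the 90 MB cutoff and file names are just my choices):

```python
# Split a large plain-text dump into ~90 MB pieces on line boundaries
# so each piece stays under the upload size that worked for me.
MAX_BYTES = 90 * 1024 * 1024
part, size = 0, 0
out = open("abstracts_part_000.txt", "w", encoding="utf-8")

with open("abstracts.txt", "r", encoding="utf-8") as src:
    for line in src:
        encoded = len(line.encode("utf-8"))
        if size + encoded > MAX_BYTES:
            out.close()
            part += 1
            size = 0
            out = open(f"abstracts_part_{part:03d}.txt", "w", encoding="utf-8")
        out.write(line)
        size += encoded

out.close()
```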
Hope it helps
Any news on this? Feels like a pretty basic thing to be able to upload JSON or CSV to the datastore.