We have a large, mostly static dataset that we want to use with the search and also the answer API.
The problem is that we can upload a large dataset and create a file but we can’t update it.
Ideally it would be a corpus indexed by ID, which would let us delete, insert, or update an individual item within that file.
The problem is right now it’s static and so would quickly become stale for us.
Are there any workarounds here?
Short of continuously uploading new data and deleting old files, not at the moment. We've had this feature in the backlog for a while but haven't pushed on it.
Can you give me some details on your dataset? How large is it / how often does it change, that sort of thing?
Well, I can't in public without giving away too many details of our app.
But there are tons of use cases — say a Twitter indexer, or Gmail, or anything where the underlying content changes fairly often.
That would technically be possible, but super expensive.
Say the file was 50MB: to update 100 bytes I'd have to re-index the whole 50MB, paying for the tokens each time.
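One way to blunt that cost, short of native per-item updates, is to split the corpus into many small files and re-upload only the shard containing the changed item — a 100-byte update then re-indexes roughly 500KB instead of 50MB. A minimal sketch of the bookkeeping (the shard count and the `upload`/`delete` hooks are assumptions for illustration, not part of any real API):

```python
# Sketch: shard a corpus by item ID so an update dirties only one small shard.
# NUM_SHARDS and the in-memory dict are illustrative choices, not a real API.
import hashlib

NUM_SHARDS = 100  # ~500KB per shard for a 50MB corpus


def shard_for(item_id: str) -> int:
    """Stable shard assignment: hash the item ID into a shard index."""
    digest = hashlib.sha256(item_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


class ShardedCorpus:
    def __init__(self) -> None:
        # shard index -> {item_id: item_text}
        self.shards = {i: {} for i in range(NUM_SHARDS)}

    def upsert(self, item_id: str, text: str) -> int:
        """Insert or update one item; returns the one shard needing re-upload."""
        s = shard_for(item_id)
        self.shards[s][item_id] = text
        return s

    def delete(self, item_id: str) -> int:
        """Remove one item; returns the one shard needing re-upload."""
        s = shard_for(item_id)
        self.shards[s].pop(item_id, None)
        return s
```

After each change you would delete and re-upload just the returned shard's file, trading one big file for many small ones; the hashing keeps each item's shard assignment stable across updates.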