The 20 File Limit on assistants is not useful for large Retrieval-Augmented Generation

Assistants seem to remove the need to build one’s own Retrieval-Augmented Generation system, but with a limit of 20 files per Assistant, we are not yet ready to replace a solution that uses, say, thousands of files as documentation for the AI.

I have been thinking about keeping my own vector index over the large file collection. When a query returns any file_ids, I would attach the corresponding files to the thread, and then for each new user message run the same check again and attach and detach files as needed. That all seems very inefficient and very likely slow.
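To make the attach/detach idea concrete, here is a minimal sketch of the bookkeeping only. It assumes you already know which file_ids are attached and which ones your own index wants for the current message. `plan_file_changes` and the `file-...` ids are hypothetical names for illustration, and the actual API calls to attach or detach files are deliberately left out.

```python
def plan_file_changes(attached, wanted, limit=20):
    """Return (to_attach, to_detach) so that the attached set matches `wanted`.

    `attached` and `wanted` are lists of file_id strings (hypothetical values).
    `limit` is the per-assistant file cap discussed in this thread.
    """
    # De-duplicate while keeping the ranking order, then respect the cap.
    wanted = list(dict.fromkeys(wanted))[:limit]
    wanted_set = set(wanted)
    attached_set = set(attached)

    to_detach = [f for f in attached if f not in wanted_set]
    to_attach = [f for f in wanted if f not in attached_set]
    return to_attach, to_detach


if __name__ == "__main__":
    to_attach, to_detach = plan_file_changes(
        attached=["file-a", "file-b"],
        wanted=["file-b", "file-c"],
    )
    print("attach:", to_attach)
    print("detach:", to_detach)
```

Only the difference is touched on each message, so files that stay relevant across turns are not re-attached. It still means extra round trips per message, which is the inefficiency I am worried about.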

Is there a strategy that is meant to be used?

I would appreciate any guidance here.

P.S. In any case, without streaming support, Assistants are not a good user experience yet, because a loading GIF will always be less appreciated than partial text appearing as it is generated.


I second this opinion. I was thinking the same thing.
At first I thought this was a per organisation limit, but later discovered that “organisation” still feels broken.

The 100GB limit is somewhat okay, but in my case I usually prefer using multiple small files, so the limit on the number of files is the problem. Otherwise, Assistants would already completely replace what I’ve built so far (with a vector DB etc.).

They might remove this restriction once retrieval is no longer free. They just don’t want people flooding the servers right now.

Retrieval $0.20 / GB / assistant / day (free until 11/17/2023)

So after the 17th it might be possible. But keep in mind that at 100GB you’re going to be paying $20 per assistant per day just to keep the files there, without even prompting the bot.
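For reference, the arithmetic behind that figure, as a small sketch. It assumes the quoted rate applies linearly per GB, per assistant, per day. `daily_retrieval_cost` is just a hypothetical helper name.

```python
# USD per GB per assistant per day, taken from the pricing line quoted above.
RETRIEVAL_PRICE_PER_GB_DAY = 0.20


def daily_retrieval_cost(gigabytes, assistants=1):
    """Daily retrieval storage cost in USD, assuming linear pricing."""
    return RETRIEVAL_PRICE_PER_GB_DAY * gigabytes * assistants


if __name__ == "__main__":
    per_day = daily_retrieval_cost(100)
    print(f"100 GB, 1 assistant: ${per_day:.2f} per day")
    print(f"Over 30 days: ${per_day * 30:.2f}")
```

Under that assumption, 100GB on a single assistant comes to about $600 over 30 days, before any token costs.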

That would make sense, but in the meantime we can only test, not build anything, because any effort might just be a waste of time on our part.

Agree with this, but as far as I understand there’s no limit on the number of files you can upload with client.files.create, only a total size limit.

Looking at the API, the file-count limits apply to the assistant and to the message: 20 and 10 respectively.

So suppose you have 100 files in your client.files index. If there were API functionality to get the top 20 or 10 for a given prompt, I would be OK with that.
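As far as I know there is no such endpoint, so the selection would have to happen on your side. A rough sketch of the idea, assuming you keep one embedding vector per file_id in your own index. The vectors, ids, and the `top_k_file_ids` helper below are made up for illustration, and how you compute the embeddings is up to you.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_file_ids(query_vec, index, k=20):
    """Return the k file_ids whose vectors are most similar to the query.

    `index` maps file_id -> embedding vector (a local, hypothetical index).
    """
    ranked = sorted(index, key=lambda fid: cosine(query_vec, index[fid]), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    toy_index = {
        "file-a": [1.0, 0.0],
        "file-b": [0.0, 1.0],
        "file-c": [0.9, 0.1],
    }
    # k=10 would match the per-message limit, k=20 the per-assistant limit.
    print(top_k_file_ids([1.0, 0.0], toy_index, k=2))
```

A linear scan like this is fine for a few hundred files. For thousands you would probably want a proper vector store instead.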

Yes, but then you had better roll out your own RAG, because file storage at OpenAI is very expensive.