RAG with more than 10 files

scottswigart · January 14, 2024, 5:24am

I want to basically implement GPTs. I want the user to be able to upload a couple hundred files. I want the user to be able to enter a prompt, and use the Assistants API to service the prompt.

Right now, it looks like an assistant can only reference 20 files, and any given message can only reference 10 files.

How are people addressing these limitations? It seems complicated to iterate over files in batches of 10 and somehow aggregate all those answers into a single comprehensive answer.

Thoughts?

curt.kennedy · January 14, 2024, 6:32am

It’s no more complicated than vectorizing the contents of the files, as chunks of text, and storing the vectors in one structure and the corresponding text in another structure, with some simple relation between the two structures.

Search is usually just a dot product between the query vector and each stored vector (multiply the vector coordinates pointwise and add up all the numbers), so this is simple too.

All of this can be done without fancy databases … just standard computing paradigms here. Databases mostly help to free up more memory for correlation. So adding one is for resource efficiency. But if you are swimming in memory, they don’t matter so much.
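A minimal sketch of the approach described above, using only the Python standard library. It keeps vectors in one list and the matching text in a parallel list, and ranks chunks by dot product. The `toy_embed` function is a hashed bag-of-words stand-in, not a real embedding model; the file names, chunk size, and class name are also made up for illustration. In practice you would swap `toy_embed` for a call to whatever embedding model you use.

```python
import hashlib
import math
import re


def chunk_text(text, max_words=50):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text, dims=256):
    """Stand-in embedding (assumption): hashed bag of words, unit length.
    A real system would call an embedding model here instead."""
    vec = [0.0] * dims
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def dot(a, b):
    """Pointwise multiply the coordinates and add up the results."""
    return sum(x * y for x, y in zip(a, b))


class InMemoryIndex:
    def __init__(self):
        self.vectors = []  # one structure holds the vectors
        self.chunks = []   # a parallel structure holds (filename, chunk text)

    def add_file(self, filename, text):
        for chunk in chunk_text(text):
            self.vectors.append(toy_embed(chunk))
            self.chunks.append((filename, chunk))

    def search(self, query, k=3):
        """Return the k best (score, filename, chunk) tuples for the query."""
        q = toy_embed(query)
        scores = [(dot(q, v), i) for i, v in enumerate(self.vectors)]
        scores.sort(reverse=True)
        return [(score, *self.chunks[i]) for score, i in scores[:k]]


index = InMemoryIndex()
index.add_file("call_q1.txt", "Alice asked about quarterly revenue growth and margins.")
index.add_file("call_q2.txt", "Bob presented the hiring plan for the engineering team.")
index.add_file("notes.txt", "The office kitchen will be renovated next month.")

top = index.search("revenue growth", k=1)
print(top[0][1])
```

Because the index position is the only link between the two lists, the number of files is limited by memory rather than by any per-assistant or per-message cap. The retrieved chunks would then be pasted into the prompt sent to the model.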

matcha72 · January 14, 2024, 6:36am

You would need to use an external application that performs RAG and provides the result to the GPT via an API.

scottswigart · January 14, 2024, 10:26pm

What do GPTs do that’s special then? I assumed the point was that they handle the RAG for the documents you provide.

anon10827405 · January 14, 2024, 10:43pm

It depends on which perspective you want. They are a no-code solution for an AI agent. It does come with a lot of limitations.

For clients:
A walled garden that requires a $20/month subscription to OpenAI.
For developers:
It abstracts everything to such a high level that it prevents any sort of interactivity, statistics, or control. It also must be used on their website.

RAG can be very simple, but it has lots of branches and a high ceiling.

Basic document retrieval works great, and the current OpenAI solution seems to be “Cost doesn’t matter, accuracy does”. So I’m happy to spend their dime on their current retrieval system if it’s anything like the Assistants system.

Besides that, I don’t think they do anything special. (opinion, beware) I think OpenAI is just banking on people flocking and using their store like a typical app store, and that the developers will naturally flock there as well.

P.S. Oh lawd pls don’t use retrieval with Assistants. They have seemingly been untouched since the announcement, with a lot of critical features still missing.

Agreed. To an extent. I’m a huge fan of Weaviate because it makes all the different functionalities of RAG very simple to implement, tinker, and harmonize. I would 100% recommend Weaviate to anyone unless they are truly just looking for simple document retrieval.

scottswigart · January 15, 2024, 3:25am

Interesting - that’s my primary use case - find relevant documents (and parts of documents) to service a user’s prompts. I’m not using GPTs, but AI Assistants through the API, and I’m frustrated that an assistant can only reference 20 documents. It might as well be 10 because a given Message can only reference 10 documents.

I’m getting very inconsistent results. For example, just asking which transcripts have a certain person asking questions will sometimes return 1 document, and sometimes 6. Same exact documents, same exact prompt.

Sounds like your recommendation is to use a vector database instead of AI Assistants?

matcha72 · January 15, 2024, 4:30am

GPTs can handle RAG, but only up to a limit; beyond that you need to use an external solution.

anon10827405 · January 15, 2024, 5:20am

I think it’s worth trying out a vector database / knowledge graph and comparing results. There are a lot of powerful analytics and tests that can be performed by hosting your own RAG solution.
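As one example of the kind of test you can run once retrieval is under your own control, here is a rough sketch of a hit-rate check: for a handful of queries where you already know which file should come back, measure how often it appears in the top k. Everything here is hypothetical, including the `DOCS` contents, the labeled queries, and `keyword_retrieve`, which is a trivial word-overlap stand-in for whatever retriever (vector database, knowledge graph, or otherwise) you want to compare.

```python
def hit_rate_at_k(retrieve, labeled_queries, k=3):
    """Fraction of queries whose expected file appears in the top-k results."""
    hits = 0
    for query, expected_file in labeled_queries:
        if expected_file in retrieve(query, k):
            hits += 1
    return hits / len(labeled_queries)


# Hypothetical documents and stand-in retriever, for illustration only.
DOCS = {
    "call_q1.txt": "alice asked about quarterly revenue",
    "call_q2.txt": "bob presented the hiring plan",
    "notes.txt": "kitchen renovation schedule",
}


def keyword_retrieve(query, k):
    """Rank file names by how many query words each document shares."""
    q = set(query.lower().split())
    ranked = sorted(
        DOCS,
        key=lambda name: len(q & set(DOCS[name].split())),
        reverse=True,
    )
    return ranked[:k]


labeled = [
    ("who asked about revenue", "call_q1.txt"),
    ("hiring plan", "call_q2.txt"),
    ("kitchen renovation", "notes.txt"),
]

score = hit_rate_at_k(keyword_retrieve, labeled, k=1)
print(score)
```

Running the same labeled set against two retrievers gives a like-for-like number to compare. A deterministic retriever like this one also returns the same files for the same query every time, which makes inconsistent results easier to track down.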

SomebodySysop · January 15, 2024, 8:45am

I second this opinion. Weaviate’s text2vec-openai transformer has worked almost flawlessly for me in nearText retrievals. And that’s just one out of several that they have available.

SomebodySysop · January 15, 2024, 8:50am

I think my RAG recommendation from months ago still holds true, even more so in this new wild and wonderful world of Assistants. See “Converting PDF Files Text into Embeddings”, post #4 by SomebodySysop.

And you can do this with tens of thousands of files.
