I basically want to implement GPTs. I want the user to be able to upload a couple hundred files, enter a prompt, and have the Assistants API service that prompt.
Right now, it looks like an assistant can only reference 20 files, and any given message can only reference 10 files.
How are people addressing these limitations? It seems complicated to iterate over files in batches of 10 and somehow aggregate all those answers into a single comprehensive answer.
It’s no more complicated than vectorizing the contents of the files, as chunks of text, and storing the vectors in one structure and the corresponding text in another structure, with some simple relation between the two structures.
Search is usually just dot products (pointwise-multiply the vector coordinates and add up the results), so this is simple too.
All of this can be done without fancy databases … just standard computing paradigms here. Databases mostly help to free up more memory for correlation. So adding one is for resource efficiency. But if you are swimming in memory, they don’t matter so much.
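A minimal sketch of the two-structure idea described above, with no database at all. Everything here is hypothetical and in-memory: toy 3-dimensional "embeddings" stand in for real model output, and the relation between the structures is just a shared index.

```python
# Two parallel structures: vectors in one, corresponding text in the
# other, related by index position (vectors[i] <-> chunks[i]).
vectors = []   # list of embedding vectors (lists of floats)
chunks = []    # chunks[i] is the text that produced vectors[i]

def add_chunk(text, embedding):
    vectors.append(embedding)
    chunks.append(text)

def dot(a, b):
    # Pointwise multiply the coordinates and add up the numbers.
    return sum(x * y for x, y in zip(a, b))

def search(query_embedding, top_k=3):
    # Score every stored vector against the query, highest first.
    ranked = sorted(range(len(vectors)),
                    key=lambda i: dot(query_embedding, vectors[i]),
                    reverse=True)
    return [chunks[i] for i in ranked[:top_k]]

# Toy embeddings for illustration only; a real system would get these
# from an embedding model.
add_chunk("cats purr", [1.0, 0.0, 0.0])
add_chunk("dogs bark", [0.0, 1.0, 0.0])
add_chunk("birds sing", [0.0, 0.0, 1.0])

print(search([0.9, 0.1, 0.0], top_k=1))  # → ['cats purr']
```

For a few hundred files' worth of chunks, a linear scan like this is plenty fast, which is the point about not needing a fancy database until memory or scale forces one.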
You would need to use an external application that can perform RAG and provide the result to the GPT via API.
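A rough sketch of what that external app's contract might look like (all names and the stubbed retrieval here are hypothetical): it takes the user's query, does its own retrieval, and hands back JSON that the GPT can consume through an Action or function call.

```python
import json

def rag_endpoint(query: str) -> str:
    """Hypothetical handler an external RAG service might expose.

    In a real deployment this would sit behind an HTTP route that the
    GPT calls; the retrieval step is stubbed out here."""
    # Stand-in for a real vector search over the user's documents.
    retrieved = [
        {"source": "transcript_01.txt", "text": "stub chunk text"},
    ]
    return json.dumps({"query": query, "results": retrieved})

# The GPT receives a JSON payload it can quote or summarize.
payload = json.loads(rag_endpoint("who asked questions?"))
```

The only real requirement is a stable JSON shape on the wire; everything behind the endpoint is yours to control.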
What do GPTs do that’s special then? I assumed the point was that they handle the RAG for the documents you provide.
It depends on which perspective you want. They are a no-code solution for an AI agent. It does come with a lot of limitations.
A walled garden that requires a $20/month subscription to OpenAI.
It abstracts everything to such a high level that it prevents any sort of interactivity, statistics, or control. It also must be used on their website.
RAG can be very simple, it has lots of branches, and a high ceiling.
Basic document retrieval works great, and the current OpenAI solution seems to be “Cost doesn’t matter, accuracy does”. So I’m happy to spend their dime on their current retrieval system if it’s anything like the Assistants system.
Besides that, I don’t think they do anything special. (opinion, beware) I think OpenAI is just banking on people flocking to their store and using it like a typical app store, and that developers will naturally flock there as well.
P.S. Oh lawd pls don’t use retrieval with Assistants. They have seemingly been untouched since announcement, with a lot of critical features still missing.
Agreed. To an extent. I’m a huge fan of Weaviate because it makes all the different functionalities of RAG very simple to implement, tinker, and harmonize. I would 100% recommend Weaviate to anyone unless they are truly just looking for simple document retrieval.
Interesting - that’s my primary use case - find relevant documents (and parts of documents) to service a user’s prompts. I’m not using GPTs, but AI Assistants through the API, and I’m frustrated that an assistant can only reference 20 documents. It might as well be 10 because a given Message can only reference 10 documents.
I’m getting very inconsistent results. For example, just asking which transcripts have a certain person asking questions will sometimes return 1 document, and sometimes 6. Same exact documents, same exact prompt.
Sounds like your recommendation is to use a vector database instead of AI Assistants?
GPTs can handle RAG, but only up to a limit, beyond which you need to use an external solution.
I think it’s worth trying out a vector database / knowledge graph and comparing results. There are a lot of powerful analytics and tests that can be performed by hosting your own RAG solution.
I second this opinion. Their text2vec-openai transformer has worked almost flawlessly for me in nearText retrievals. And that’s just one out of several that they have available.
I think my RAG recommendation from months ago still holds true, even more so in this new wild and wonderful world of Assistants. Converting PDF Files Text into Embeddings - #4 by SomebodySysop
And you can do this with tens of thousands of files.