Hello there everyone. I've been trying to do something, and I'm not sure if it's possible. I'm trying to build a bot with an upload feature using the Assistants API, connected to an external database with all my country's laws, regulations, acts, jurisdictions, cases, etc. It's meant to help lawyers with their day-to-day operations, but I wanted to know… when the Assistants API is connected to an external database, is there a limit on the size of that database?
Also, if the bot has an upload feature, can I upload multiple files to it and ask it questions about those multiple files?
Hi!
Attaching files to an assistant at the message level works essentially the same as attaching them at the assistant level.
The limits are:
The maximum file size is 512 MB and no more than 2,000,000 tokens (computed automatically when you attach a file).
Regardless of the size of the database, these limits apply. You should only add the documents you actually need in the first place, to save tokens and money.
You can use annotations to refer to specific files in the model response:
https://platform.openai.com/docs/assistants/how-it-works/message-annotations
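As a quick sanity check before uploading, you can estimate whether a document will stay under the token cap. This is a rough sketch assuming the common ~4-characters-per-token heuristic for English text; for exact counts you'd use OpenAI's tiktoken library.

```python
# Rough pre-upload check against the 2,000,000-token cap. The
# ~4-characters-per-token ratio is a coarse heuristic for English
# prose; for an exact count, use OpenAI's tiktoken library.

TOKEN_LIMIT = 2_000_000
CHARS_PER_TOKEN = 4  # rough average for English text

def estimated_tokens(text: str) -> int:
    return len(text) // CHARS_PER_TOKEN

def fits_token_limit(text: str) -> bool:
    return estimated_tokens(text) <= TOKEN_LIMIT
```

A file can be under the 512 MB size cap but still over the token cap, so it's worth checking both.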
Thank you for your timely response. I appreciate it. The database is updated regularly because of new laws, legislation and cases. I wish I could choose which documents are relevant.
It's more like CoCounsel, but for Africa. How does CoCounsel decide which files are relevant and which are not?
Yes, you have identified a hard limitation of the Assistants API, which by itself cannot dynamically select relevant files from a large corpus on a per-request basis.
The typical workaround is to consolidate all the text into a few very large files and enable the retrieval tool.
Otherwise you would have to build a function that selects the matching files and then uploads them at the message level.
At this stage you are almost better off building your own RAG system.
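A minimal sketch of that per-request file selection, assuming a plain keyword-overlap score (a real implementation would more likely use embeddings); the file names and tiny corpus below are made up for illustration:

```python
import re

# Score each document by keyword overlap with the question and keep
# only the best matches; those are then attached at the message level.
# Keyword overlap is a deliberately simple stand-in for embedding-based
# similarity; the file names and corpus are illustrative.

def select_files(question, docs, top_k=3):
    """Return names of the top_k docs sharing the most words with the question."""
    q_words = set(re.findall(r"\w+", question.lower()))
    scores = {
        name: len(q_words & set(re.findall(r"\w+", text.lower())))
        for name, text in docs.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [name for name in ranked if scores[name] > 0][:top_k]

corpus = {
    "companies_act.txt": "incorporation of companies, shares, directors' duties",
    "insurance_act.txt": "insurance policies, premiums, underwriting, claims",
    "penal_code.txt": "offences, criminal liability, sentencing",
}
```

The selected files would then be uploaded and attached to the user's message before creating the run.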
PS: I didn't add this link before, but it's recommended reading:
https://platform.openai.com/docs/assistants/tools/knowledge-retrieval
I see what you're saying. I really appreciate your help. I had thought of that idea, of putting all the texts into one file and adding the retrieval function, but the thing is, this database is updated regularly. It would be exhausting to download and re-upload the files from the site to the software every day, which would get annoying. This is why I wanted to use the database's API, so the AI is updated automatically.
However, I am aware that the file size limit for each file is 512 MB for the Assistants API. Is there a limit on the number of files, though?
There used to be a limit of 20 files but I can’t find it in the documentation right now.
Maybe just go to the Playground and upload files until you hit a limit (or don't).
There will definitely be options to automate the process of regularly updating the knowledge files, without needing an AI for anything beyond the initial set-up, the tests and the bugfixes.
Dealing with legal texts is a challenging retrieval problem due to their complex interdependencies. When using the assistants retrieval tool, your options for improvement are limited. I'd say it's definitely worth the upfront effort to build your own RAG.
But then again: if you set up a function call to determine the files you need and then add those to the message for retrieval, it should work. If you want to stick closer to your original plan, this would be the way.
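One way to automate the regular refresh, sketched under the assumption that the updated documents land in a local folder: hash each file and re-upload only what changed since the last run. `upload_to_assistant` here is a hypothetical wrapper around the Files API, not a real function.

```python
import hashlib
import json
from pathlib import Path

# Daily sync sketch: remember a hash per document and re-upload only
# the files whose content changed since the last run. The actual
# OpenAI upload call is stubbed out below as a hypothetical helper.

STATE_FILE = Path("sync_state.json")  # hashes remembered from the last run

def changed_files(doc_dir: Path) -> list[Path]:
    old = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}
    changed, new_state = [], {}
    for path in sorted(doc_dir.glob("*.txt")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        new_state[str(path)] = digest
        if old.get(str(path)) != digest:
            changed.append(path)  # new or modified since last sync
    STATE_FILE.write_text(json.dumps(new_state))
    return changed

# Run from cron or a scheduler, e.g.:
# for path in changed_files(Path("laws")):
#     upload_to_assistant(path)  # hypothetical upload wrapper
```

Pulling from the database's API into that local folder would then be a separate scheduled step.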
Your links are really informative. They helped shed some light on a few issues. I can see how function calling could make things simpler and easier…
However… how does Casetext do it? I mean the CoCounsel feature. It carries a lot of data and it uses GPT-4. They don't have their own RAG. Do they simply use function calling? Might you know anything about how they connect their GPT to the database?
I don't know the Casetext app or how it has been built. I would expect that they did not use the Assistants API, due to the many limitations and costs associated with this specific approach. Let's make sure we are talking about the same thing:
When using an LLM and providing it with additional external context for the model's reply, this is called RAG (retrieval-augmented generation).
If your goal is simply to return the matching documents from a database, without having the LLM use them to generate responses, then, yes, it's not RAG but a simple database-query app.
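To make the distinction concrete, here is the RAG loop in miniature: retrieve the passages most similar to the question, then prepend them to the prompt sent to the LLM. The bag-of-words "embedding" below is a stand-in for a real embedding model, the actual LLM call is omitted, and the passages are invented examples.

```python
import math
import re
from collections import Counter

# Minimal RAG loop: rank passages by similarity to the question, then
# build an augmented prompt from the top matches. Bag-of-words cosine
# similarity stands in for a real embedding model; the LLM call that
# would consume the prompt is omitted.

def embed(text):
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_prompt(question, passages, top_k=2):
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

passages = [
    "Section 12: an insurer must settle claims within 90 days.",
    "Section 3: companies must file annual returns.",
    "Section 7: directors owe fiduciary duties to the company.",
]
```

A pure database-query app would stop after the ranking step and just return the matching documents.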
Additionally, it is the choice of the person or company building an app to call it an 'assistant', regardless of whether they use the Assistants API from OpenAI.
Well, you really know your stuff, sir. I didn't know how expensive the Assistants API could get… wow!
You were right about making my own custom RAG. However, if I wanted to create a RAG system for the legal and insurance industry, one with huge amounts of data from PDFs and external APIs as its knowledge base, plus a feature for uploading documents for analysis and comparison against that knowledge base, just like the OpenAI assistant, how would you recommend I go about it? Would LlamaIndex be of help?
Please do know I'm not a coder or a guy with a heavy budget… hahaha. Just a guy trying to make a buck as well.
Oh…and happy Easter!
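If you do roll your own RAG, the first step is splitting each extracted document into overlapping chunks small enough to embed and retrieve individually. Frameworks like LlamaIndex handle this (and the embedding and indexing) for you, but the core idea is simple; the chunk sizes below are illustrative, not recommendations.

```python
# First step of a hand-rolled RAG pipeline: split extracted text into
# overlapping chunks. The overlap means a clause cut at one chunk's
# boundary still appears whole in the next chunk. Sizes are illustrative.

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into ~chunk_size-character pieces that overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Each chunk would then be embedded and stored in a vector index, with the retrieval loop ranking chunks instead of whole files.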
Hi @cydronem. I am working in the same industry. I also see that the OpenAI assistant has a limit of 10,000 docs in its vector store. Is there any way, with LlamaIndex, to increase this document limit?