“You can attach a maximum of 20 files per Assistant, and they can be at most 512 MB each.”
Can someone clarify this for me?
I should technically be able to have one assistant, with the same instructions and model, power thousands of threads.
If anything, I’d think the file limit would be at the thread level, not the assistant level.
My app allows users to upload their own files, so this wouldn’t be a scalable solution.
I think you’re right; it’s just not going to work the way you want it to right now. They might change that file limit, but 20 files is way too low for even the most basic use cases.
Thanks! Is there a limit to the number of assistants then?
Here is what I found out.
Each file can hold a maximum of 2M tokens, and we are limited to 20 files of pre-loaded data per assistant. The end user can then upload and ask questions against the data you have provided. So think of it as pre-training data that we set out into the wild; the items the user queries against it are not counted in these limits. After fighting with it, I ended up writing a script that read each file and cut me down to 1.8 million tokens per file and no more than 20 files. When I hit the limits I was asked to remove data, via a prompt I asked the system to calculate for me. I wish you luck!
I think the idea, based on what I am reading, is that it’s a temporary file system: you build a front end to upload the docs, then remove them when that session is done. Under the doc retrieval tool that seems to be by design, given the listing, upload, and delete functions of that tool.
So basically this is not meant to replace a RAG pipeline built with LangChain (like azure-search-openai-demo on GitHub, to be clear)?
Do you think that this is only a temporary limitation (during the beta phase)?
Could be, but if you read the other threads on costs, it could get expensive.
Looks like it … the initial RAG tool seems to be very minimalistic.
Possibly during beta. I would be surprised if that is not increased later. 20 files and 2M tokens is tiny for most use cases.
Correct – there seems to be no control on how much of the uploaded “knowledge” is added as context in the API call … which means that each API call could be quite token-heavy (for now). They did mention that they will optimize the RAG later.
True … we have a customer who uploaded 70,000 documents - lol …
I also hit the 2M token limit, via a single 300 MB PDF file. You only find out at the end, after it crunches on it (waiting), not before. Caveat emptor with “undocumented” RAG processing.
I’m getting about a 50% failure rate of assistants reading the text files I upload. Anyone else having this problem? I get messages like:
I apologize for the inconvenience, but it seems there’s been an issue accessing the file due to technical limitations with the myfiles_browser tool. I’m unable to open or browse the content of the file you’ve uploaded.
I apologize for the inconvenience, but it seems there was a misunderstanding with the system regarding the uploaded file. I’m unable to access it using my myfiles_browser tool.
Could you please try uploading the file again? This may resolve the issue, and I will do my best to assist you with the summary as soon as I’m able to access the document.
Which means, OpenAI will have to acquire a company like Google Drive or AWS S3 to be able to handle the anticipated storage capacity. They have some pretty big brains over there, so I guess they’ve thought this through?
So, they are uploading and then deleting these files every session? Is there some sort of persistent memory in play here, or are they actually processing these files on each session as well? Maybe that’s why these things are eating up so many tokens!
From what I see in the Playground, the assistant’s files stay there. I believe the API uses the same method, so the 20 files would be there to use the whole time. What I am saying is, I think the idea behind the assistant (agent) is that your system uploads up to 20 docs to read. You can then use the delete command to remove one if needed, to add another. Then you can build a smart front end that looks at how many times a doc is used, to decide whether to keep docs in memory that haven’t been used in X amount of time. Hope that helps point everyone toward how I would do it; that would be a simple way. Then you can build off that to make a better system, until they expand the storage. I could see them making that a subscription where you buy more space, like Google and all the others, making OpenAI a business platform on another level. Fun, eh? Never know, haha.
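The keep-or-evict idea above comes down to simple client-side bookkeeping. A sketch, assuming a hypothetical `on_evict` callback where you would wire in the actual delete-file API call (only the tracking logic is shown; nothing here talks to the API):

```python
from collections import OrderedDict

MAX_FILES = 20  # current per-assistant attachment limit

class AssistantFileCache:
    """Track up to MAX_FILES attached file IDs, evicting the one used
    least recently when a new file needs a slot.

    on_evict is a stub callback; in a real front end it would issue
    the delete-file API call before the replacement is uploaded.
    """

    def __init__(self, on_evict=lambda file_id: None):
        self._files = OrderedDict()  # file_id -> use count, oldest first
        self._on_evict = on_evict

    def touch(self, file_id: str) -> None:
        """Record that a file was used to answer a question."""
        self._files[file_id] = self._files.get(file_id, 0) + 1
        self._files.move_to_end(file_id)  # now most recently used

    def add(self, file_id: str) -> None:
        """Attach a new file, evicting the least recently used if full."""
        if len(self._files) >= MAX_FILES:
            evicted, _ = self._files.popitem(last=False)  # oldest entry
            self._on_evict(evicted)
        self._files[file_id] = 0
```

Counting uses per file (rather than plain recency) would let you apply the “not used in X” rule described above; the counter is already tracked here for that purpose.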
I was in the process of having my code chunk my context to keep things under the token limit and to control costs. Then I saw this part of the API that lets you upload files. People are talking about 2M tokens, etc. If the context is largely static (e.g. bios of characters in a novel-writing tool), would it not be much faster and less costly to supply it in files rather than in the prompt? If I were crazy enough to supply a 128K prompt to gpt-4-1106-preview, it would cost me $1.28 for a single call. But if I put all that in a file, uploaded it, and asked a simple question, then it might cost me $0.01? What is the downside to this?
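The arithmetic above is easy to check, assuming the gpt-4-1106-preview input price of $0.01 per 1K tokens implied by the $1.28 figure (pricing as of this thread; check the current price list):

```python
INPUT_PRICE_PER_1K = 0.01  # USD per 1K input tokens, gpt-4-1106-preview

def prompt_cost(tokens: int) -> float:
    """Input-token cost of a single call, ignoring output tokens."""
    return tokens / 1000 * INPUT_PRICE_PER_1K

full_context = prompt_cost(128_000)  # whole context in the prompt: $1.28
short_question = prompt_cost(1_000)  # context in files, tiny prompt: $0.01
```

The catch, per the posts above, is that you can’t currently control how many tokens the retrieval tool itself injects as context, so the “file” path may not actually land near $0.01 per call.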
I have the exact same issue (only UTF-8 txt files), very frequently. It’s annoying (besides the high price)… furthermore, I have the feeling that the assistant works differently in the Playground than via the API. Sometimes the Playground assistant can’t access the documents (for a longer time) but the API can, and vice versa… but that can’t be, right?
I’m getting these same error messages via the API right now. It’s really frustrating… and expensive.
I’ve tried 5 assistant prompts using one large text file, which I pruned and made adjustments to in case it was too long or had text which broke their parser. But I just tried again with 10 HTML files and one very small text file.
I’m getting GPT output like “I noticed that you’ve uploaded multiple files, however, there seems to be an issue accessing them using my browsing tool, which would have allowed me to review any content you may have provided”