For my app I have one “master” assistant and a dedicated thread per user, so each user has their own context/experience. I want it so that users can upload personal files in their thread (so NOT attaching it to the assistant during the assistant creation process with assistants.create(), but instead with threads.messages.create() at the thread level). Thus in the thread the assistant will give responses specific to the files that a given user has uploaded in their own thread (and not refer to other users files uploaded in separate threads).
My question: Can a user upload more than 20 files in their own thread? Ie. does the 20 file limit only apply to when you’re creating an assistant and attaching the files at “assistant level”? I’m praying this is the case as it’s expected my users will upload more than 20 files to their thread over the course of their usage lifecycle…
Well, unfortunately, yes, each assistant appears to only be able to hold 20 files maximum, regardless of threads.
I do like your method of thread management vs. assistant management, but when you get RAG involved, things get trickier.
You may have to develop this kind of architecture/framework yourself, which would ultimately mean a lot more work to achieve what you originally want. To give you an idea, I basically developed my own data structure mimicking the thread structure using chat completions but with a little more control, on top of working with my own RAG db, which means spinning up a neo4j / supabase db, and managing how data is piped between queries and the db. I’m happy with it, but it does take time and effort.
That being said, I do know OAI should be updating their assistants quite soon, and fingers crossed better message & thread management comes with that. So, hopefully, you problem because a lot more solvable in the near future.
So I did some testing by uploading 22 simple, unique text files in a thread. They all uploaded successfully. So you can do more than 20??? I’m so confused. Do you know why? See the image below: I simply print out all the file_ids I have in the thread.
Nope, sadly I have no clue. I am not OpenAI staff, so your guess is as good as mine.
Even still, I’d be very cautious about assuming you can go beyond 20 files until something more is announced/we hear more.
If you expect a high volume of users, each with their own ability to upload their own files, I would still recommend managing some kind of RAG db yourself, because there’s a lot of room for things here to get unwieldy, and there’s some edge cases with this feature specifically that can get wonky (like unintentionally retrieving files from different threads).
Tl;dr, It’s still in beta, meaning it’s still a bit unstable.
I see. Do you have a recommendation for which RAG setup I should do, as someone who’s never done this kind of set up before? I’m a junior engineer. App is fairly simple, each user just needs their own thread.
I think something like supabase would be a good start. You can check out the cookbook on it here, which is a valuable resource to have on hand in general.
If that becomes too complex (or if it interferes too much with what your company’s got in development), I would recommend giving each user their own assistant instead, so you don’t have to end up creating your own workarounds. The trade-offs are ultimately up to you and the team you work with.
containerization, with a 1:1 correlation between user and assistant. Essentially, each assistant becomes a kind of proxy representing the user, which is very easy to manage and scale.
The file limits then become equivalent to the limits of the assistant, meaning each user would get to upload 20 files max (roughly speaking, I guess?). It’s already managed and handled; no extra effort required. It has the added bonus of preventing one user from attempting to access another user’s files, which is a potential vulnerability (for now) when trying to manage threads from different users as one assistant. Plus, with this layout, it allows users to create their own threads without much extra legwork in identifying who a thread belongs to (which you would have to do yourself if you chose to represent threads as users).
This is the intended purpose of assistants, and why assistants are structured this way.
I personally don’t find anything inherently wrong with the other approach, but it is more complex, and is better suited for use cases where the typical “agent” model as it stands now doesn’t fit for your intended product/service.