Assistant V2 files. Size limit

I found this in an older thread: In Assistants, Is there a limit to File size and number of files? - #5 by tyagi.shubham177
“You can attach a maximum of 20 files per Assistant, and they can be at most 512 MB each.”

I tried to upload some PDFs to the vector store, two of which are 450 MB and 350 MB.

The two larger files appear in the vector store but are marked ‘Failed’.

Where can I find documentation on file limits?

Any help appreciated.

How many tokens are present in each file?

See:

File Search

The maximum file size is 512 MB. Each file should contain no more than 5,000,000 tokens per file (computed automatically when you attach a file).
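To make those two limits concrete, here is a minimal sketch of a pre-upload check. The function name `check_file_search_limits` and the constants are my own, not part of any OpenAI SDK, and I am assuming "512 MB" means 512 * 1024 * 1024 bytes. The token count has to come from somewhere else, since the API only computes it when the file is attached.

```python
# Limits as quoted from the File Search docs above.
# Assumption: "MB" is treated as 1024 * 1024 bytes.
MAX_FILE_BYTES = 512 * 1024 * 1024
MAX_FILE_TOKENS = 5_000_000


def check_file_search_limits(size_bytes, token_count=None):
    """Return a list of problems; an empty list means both checks passed.

    token_count is optional because it is usually unknown until the text
    has been extracted and tokenized.
    """
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append(
            f"file is {size_bytes} bytes, over the {MAX_FILE_BYTES} byte limit"
        )
    if token_count is not None and token_count > MAX_FILE_TOKENS:
        problems.append(
            f"file has {token_count} tokens, over the {MAX_FILE_TOKENS} token limit"
        )
    return problems
```

For a file on disk you could pass `os.path.getsize(path)` as `size_bytes`. Note that a 450 MB or 350 MB file passes the size check, so a 'Failed' status on files of that size would point toward the token limit instead.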

Hi!

As @elm points out, file size and token count are the main factors to consider. In this case it’s a bit tricky because the basic way to count tokens expects a text string as input.

Here is a repo (I am not affiliated with it in any way) that gets the token counts for PDF files based on the tiktoken library.

In general, you can expect PDF files to cause additional issues with retrieval. If you can provide the knowledge input as a text file, your results will likely be better, and you can check the token counts yourself using tiktoken.
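For the text-file route, a rough sketch of counting tokens yourself could look like this. The helper names are hypothetical, and `cl100k_base` is my assumption for the encoding; I don't know which tokenizer file search uses internally, so treat the result as an estimate.

```python
from pathlib import Path


def count_tokens(text, encode):
    """Count tokens in a string using any encode(text) -> list callable."""
    return len(encode(text))


def tiktoken_encoder(encoding_name="cl100k_base"):
    """Build an encode callable backed by tiktoken (pip install tiktoken).

    Assumption: cl100k_base is close enough to what file search uses.
    """
    import tiktoken  # imported lazily so the rest works without it

    enc = tiktoken.get_encoding(encoding_name)
    # disallowed_special=() makes special-token-looking text count as
    # ordinary text instead of raising an error.
    return lambda text: enc.encode(text, disallowed_special=())


def count_file_tokens(path, encode):
    """Read a UTF-8 text file and return its token count."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return count_tokens(text, encode)
```

Usage would be something like `count_file_tokens("knowledge.txt", tiktoken_encoder())`, where `knowledge.txt` is a placeholder file name. If the count comes out above 5,000,000, the file needs to be split before uploading.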

Thanks for the info. I’ll try splitting the PDFs in half.

It appears that the maximum number of files that can be used in file search has been updated. It is now 10,000, per the reference below.

https://platform.openai.com/docs/assistants/how-it-works/creating-assistants#:~:text=You%20can%20attach%20a%20maximum%20of%2020%20files%20to%20code_interpreter%20and%2010%2C000%20files%20to%20file_search%20(using%20vector_store%20objects).
