The Assistants API accepts files up to 512 MB. It has the tool myfiles_browser, which exposes an open_url function. When you ask for a summary of a file, it calls open_url and processes the result.
However, 512 MB is roughly 128M tokens, about 1,000 times larger than the 128k-token context window of the largest-capacity model, GPT-4-turbo.
Does it call a hidden function to summarize recursively, or does it do something else?
My understanding is that file upload is basically RAG, is it not? If so, the file must be fragmented so that each fragment can be associated with a vector. But I have no idea how that would help build a summary.
It is ‘chunking’ your file, embedding each chunk, and storing the chunks in a vector database. Breaking the file down keeps each piece within the token limit, enabling the system to handle more tokens in aggregate.
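The chunk-and-embed pipeline described above can be sketched roughly like this. This is an illustrative sketch, not OpenAI's actual implementation: the chunk size, overlap, and the toy letter-frequency "embedding" are all stand-ins (a real system would call an embedding model and a proper vector store).

```python
from math import sqrt

def chunk(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks so no single piece exceeds the context window."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text: str) -> list[float]:
    """Toy stand-in for a real embedding model: normalized 26-dim letter-frequency vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if 'a' <= ch <= 'z':
            vec[ord(ch) - ord('a')] += 1.0
    norm = sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def build_index(text: str) -> list[tuple[list[float], str]]:
    """'Vector database': a list of (embedding, chunk) pairs."""
    return [(embed(c), c) for c in chunk(text)]

def retrieve(index, query: str, k: int = 3) -> list[str]:
    """Return the k chunks most similar to the query; only these go into the prompt."""
    q = embed(query)
    scored = sorted(index, key=lambda pair: cosine(pair[0], q), reverse=True)
    return [c for _, c in scored[:k]]
```

At query time only the top-k retrieved chunks are placed in the prompt, which is why the whole 512 MB file never needs to fit in the context window at once.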