I would like to upload a file and send it as part of the prompt as an attachment, I tried to send it by text and got limited by the tokens limit.
I saw couple of examples with assistant, but it looks it’s asynchronous ==> You send your file and you wait for it to be processed as part of your knowledge base.
I want to be kind of “instant” to quickly being able to ask questions about the document and extract key information and I don’t know if it’s possible with chat completions endpoint API.
Thank you for your help
Hi there - with GPT-4 turbo which has a 128k token limit you should not run into many token limit issues (unless it is a very large document). For the assistant API you also have the option to upload a file as part of a thread - this would be separate from the files you have uploaded into your knowledge base. You can then query this file. I have implemented this for my assistant and can confirm it works well.
Unfortunately, the document size varies and I’m not sure what the user will upload on the chatbot.
Could you please share with us a snippet of the code for the assistant just to see the flow ?
You cannot directly “attach files” to a chat completion request. You must perform the processing that makes it into text that the AI can understand.
If you are using a document then still larger than the AI model’s available context length, you will have to pursue different paths depending on the desired result:
- If you need to have questions answered, informed by the document, then you will need to chunk and use embeddings to store and retrieve relevant parts from a vector database
- If you need whole document tasks such as summarization, you will need to build the agent-like tasks of summarizing chunks and then summarizing all the summaries, perhaps only performing that by a summary function using other AI.
Those will take more time than saying “hi”, but at least it is under your control where you can report the operations to the user and, unlike assistants, not get errors because there is no file status and the query is performed regardless.