What's best for Assistant with long files attached?

Hello everyone!
I need to create an assistant that answers questions from a PDF file that has 465 pages.

First question:
ChatGPT recommends GPT-4.1 as the model. Would that be OK? As far as I can tell, it makes sense.

Now, second question:
What would be better: attaching the whole big file, or splitting the PDF into 10 different files of 46 pages each? (Luckily for me, the PDF is actually a combination of 10 big chapters, so it would be a clean semantic split with no loss of meaning/context.)

Just to clarify: the file is static; it won’t change. I’ll create the Assistant from the UI, attach the file, and that’s all. Then I’ll query it from the API (I’m actually already doing all this with the big file).

Last, a third bonus question: I need to reduce errors to a minimum and force the Assistant to always “copy and paste” the answer from the file whenever the answer is in the file. Would temperature: 0.01 and top_p: 0.30 be good values? Any other recommendations there?

Thanks!

It is OK, and if you need to keep an eye on costs you can also try gpt-4.1-mini.

I would say the best option here is to create a vector store, to avoid uploading the PDF multiple times or having to track which PDF you are using. Then you can refer to it by its vector store ID alone. The cookbook has a complete example of how to create and use a vector store.
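A minimal sketch of that setup, assuming the openai Python SDK (the vector store endpoints live under `client.beta.vector_stores` in older SDK versions and `client.vector_stores` in newer ones, so adjust to your version; the function name `build_chapter_store` is just illustrative):

```python
def build_chapter_store(client, chapter_paths, name="pdf-chapters"):
    """Upload each chapter PDF once and index all of them in one vector store.

    `client` is an openai.OpenAI() instance; `chapter_paths` is a list of
    local PDF paths (e.g. your 10 chapter files).
    """
    store = client.vector_stores.create(name=name)
    for path in chapter_paths:
        with open(path, "rb") as f:
            # upload_and_poll blocks until the file is processed and indexed
            client.vector_stores.files.upload_and_poll(
                vector_store_id=store.id, file=f
            )
    return store.id  # reuse this ID in every subsequent query
```

Since the file is static, you only run this once; afterwards every API call just references the returned vector store ID.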

An alternative is to use the vector store API directly to retrieve the literal semantic matches, then prompt-engineer your way into extracting the answer as literally as you want from the chunks it returns.

When using it as a tool, you might not see what it is actually retrieving and using behind the scenes, which could be the completely wrong chunk, or the content could be reformatted before it reaches the answer.
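A hypothetical sketch of that "retrieve directly, then extract literally" flow, assuming an SDK version that exposes `client.vector_stores.search` and the Chat Completions API (the response field names and the `answer_verbatim` helper are assumptions, not a documented recipe):

```python
def answer_verbatim(client, store_id, question, model="gpt-4.1"):
    """Search the vector store yourself, then ask the model to quote only."""
    results = client.vector_stores.search(
        vector_store_id=store_id, query=question
    )
    # Concatenate the returned chunk texts so you can see exactly
    # what context the model is given.
    context = "\n\n".join(
        part.text for r in results.data for part in r.content
    )
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # deterministic as possible for literal extraction
        messages=[
            {
                "role": "system",
                "content": (
                    "Answer ONLY with a verbatim quote from the context. "
                    "If the answer is not in the context, say you don't know."
                ),
            },
            {
                "role": "user",
                "content": f"Context:\n{context}\n\nQuestion: {question}",
            },
        ],
    )
    return response.choices[0].message.content
```

Because you control the retrieval step, you can log or inspect `context` and verify the model is quoting from the right chunk, which also addresses the temperature/top_p question: with an instruction this strict, temperature 0 tends to matter more than tuning top_p.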
