Issues with Accessing PDF Documentation

platypus · October 5, 2024, 9:36am

Regarding referencing precise pages: I created a topic about this a while back, which you can check out here.

It’s not an issue with the model; it has to do with how the PDFs are ingested and represented behind the scenes.

The documents are loaded as text by default, chunked up (normally a few sentences at a time), and indexed using text embeddings (vector representations). Page and section information is therefore lost.
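
To make that concrete, here's a rough sketch in Python of the difference between flat chunking and chunking that keeps the page number. The actual ingestion pipeline isn't public, so this is only an illustration of the general idea: the function names (`chunk_flat`, `chunk_with_pages`), the regex sentence splitter, and the two-sentence chunk size are all my own assumptions, and the embedding step is left out entirely.

```python
import re

def split_sentences(text):
    # Naive splitter for illustration only (assumption): break after ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def chunk_flat(pages, sentences_per_chunk=2):
    """Join all pages into one string, then chunk. Page numbers are discarded."""
    sentences = split_sentences(" ".join(pages))
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]

def chunk_with_pages(pages, sentences_per_chunk=2):
    """Chunk each page separately and keep the 1-based page number as metadata."""
    chunks = []
    for page_number, page_text in enumerate(pages, start=1):
        sentences = split_sentences(page_text)
        for i in range(0, len(sentences), sentences_per_chunk):
            chunks.append({
                "page": page_number,
                "text": " ".join(sentences[i:i + sentences_per_chunk]),
            })
    return chunks

# Two made-up "pages" of extracted text.
pages = [
    "Install the tool. Run the setup script. Restart the machine.",
    "Open the settings panel. Enable logging.",
]

flat = chunk_flat(pages)          # plain strings, no page info at all
tagged = chunk_with_pages(pages)  # dicts that still carry a page number

for chunk in flat:
    print(chunk)
for chunk in tagged:
    print(chunk["page"], chunk["text"])
```

In the flat version the second chunk mixes the end of page 1 with the start of page 2, and nothing in the chunk records where it came from, so there is nothing for the model to cite. The page-aware version only helps if you control the ingestion step yourself.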

In the topic I linked above, I suggested a few potential ways of improving page referencing, but they’re still not guaranteed to work 100% of the time.
