How can I get the page on which the information was found?


I have a PDF file (a book) which I pass to the model as content (role user).
I want to ask questions and get answers that include the specific page on which the information was found.
How could this work?
The following code works fine, but (of course) does not give me any page reference:

response =
model = GPTModell,
max_tokens = 2024,

    messages = [



Thank you!

The message format you show is not quite right. The “assistant” role is only used for sending back what the AI wrote earlier in the chat history, or for simulating something the AI would say. It should not be the last message.

It sounds like when you extract the document to text, you just need to make the pagination and page numbering clear in the text you provide:

my.pdf, page 1:
The crow says 'boo'.


my.pdf, page 2:
The cow is new.


Then, if you want a particular permanent behavior, the first message you include should be a system role message describing how the AI operates:

{"role": "system", "content": "You are AnswerBot. \
When answering from included PDF pages, include the page number that was used."},
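
Put together, the message list could look roughly like the sketch below. `build_messages` is a hypothetical helper, not part of any library; it only builds the list, which you would then pass as `messages` to the chat completion call you already have. The system instruction goes first and the user message, holding the labelled pages plus the question, goes last.

```python
def build_messages(system_prompt, page_text, question):
    """Return a chat message list: system instruction first, user message last."""
    user_content = (
        "Pages extracted from the PDF:\n\n"
        f"{page_text}\n\n"
        f"Question: {question}"
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_content},
    ]


SYSTEM_PROMPT = (
    "You are AnswerBot. "
    "When answering from included PDF pages, include the page number that was used."
)

messages = build_messages(
    SYSTEM_PROMPT,
    "my.pdf, page 1:\nThe crow says 'boo'.\n\nmy.pdf, page 2:\nThe cow is new.",
    "Describe the vocalizations of a crow according to documents.",
)
```

No assistant message is needed here unless you are replaying earlier turns of a conversation.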

I tried this in ChatGPT to show the kind of response you get from a single user message that puts the documentation first, followed by the actual question about the documents.


// employ text extracted from user PDF files pages when appropriate. Cite the source page number when information is used. Pages:

my.pdf, page 1:
The crow says ‘boo’.


my.pdf, page 2:
The cow is new.


my.pdf, page 3:
Today is the greatest day to be a cow or crow.


Describe the vocalizations of a crow according to documents.


To describe the vocalizations of a crow based on the provided document excerpts, I’ll extract and analyze the relevant information from the text. The first step is to look at the pages where crows are mentioned and see what is said about their vocalizations.

From the given text:

  • On page 1, it is stated: “The crow says ‘boo’.”

This sentence directly provides information about the crow’s vocalization. The crow is described as saying “boo,” which is an onomatopoeic representation of the sound it makes. This is a simple and clear description, suggesting a distinct sound that could be characterized as unusual for a crow since they are typically known for their cawing sound. However, in the context of this document, the vocalization is given as “boo.”

To consolidate, the vocalizations of a crow, according to the provided document excerpts, are represented by the sound “boo.” This description is found on page 1 of the document.

It does not work. I think one issue is that there are no page boundaries left, because I extracted the text from the .pdf into a single string called “book” with the following code:

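for pagenumber in range(startpage, endpage):
    # assumption: the line that loads the page is missing from the post; reader is e.g. a pypdf PdfReader
    page = reader.pages[pagenumber]
    text = page.extract_text()
    book += text + " "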

Any ideas how to solve this?

Thank you!
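
One way to keep the page information is to label each page's text as it is extracted, instead of concatenating everything into one unlabelled string. The following is a minimal sketch under assumptions: `extract_labelled_pages` is a hypothetical helper, and `reader` is assumed to be any object with a `.pages` sequence whose items have `extract_text()`, which is how pypdf's `PdfReader` behaves. The index is zero-based, so the label uses `index + 1`.

```python
def extract_labelled_pages(reader, filename, startpage, endpage):
    """Join page texts, each preceded by a 'filename, page N:' header.

    `reader` is assumed to expose `.pages`, a sequence whose items have
    `extract_text()`. Indices are zero-based, so the label is index + 1.
    """
    chunks = []
    for index in range(startpage, endpage):
        # Fall back to an empty string if a page yields no text.
        text = reader.pages[index].extract_text() or ""
        chunks.append(f"{filename}, page {index + 1}:\n{text.strip()}")
    # A blank line between pages keeps the boundaries visible to the model.
    return "\n\n".join(chunks)


# Possible usage with pypdf (not executed here):
# from pypdf import PdfReader
# book = extract_labelled_pages(PdfReader("my.pdf"), "my.pdf", 0, 3)
```

Note that the label is the position of the page in the PDF file, which may differ from the page number printed in the book (for example when the front matter is numbered separately), so an offset may be needed.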