Chatbot with user-provided files: how to let GPT have an "overall" view of the file content?

I can see no use for a command like “give me an outline of this file”, because once the file is chunked into pieces of knowledge, it no longer exists as a whole.

You can widen the range of questions the AI can answer by adding more information to the knowledge database. Entries like “The list of files used to train the AI: list” or “Knowledgebase article title: xxx, Article summary: yyy” can themselves be pieces of knowledge.

Then a question like “do you have any papers that discuss mouse behavior” may produce an embedding that matches those entries, so the search returns some file summaries.
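As a rough sketch of that idea, here is how summary entries stored next to ordinary chunks could be picked up by a similarity search. Everything in it is an assumption for illustration: the entries in `KNOWLEDGE` are made up, and `embed` is a toy bag-of-words stand-in for a real embedding model, used only so the example runs without any API.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge entries: file-level summaries stored like any other chunk.
KNOWLEDGE = [
    "Knowledgebase article title: Bananas: A deep look. "
    "Article summary: banana cultivars, speciation, growing, harvesting.",
    "Knowledgebase article title: Mouse behavior in mazes. "
    "Article summary: a paper on how mice explore and learn mazes.",
    "The list of files used to train the AI: banana.pdf, mouse_maze.pdf",
]

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query, entries, top_k=1):
    """Return the top_k entries most similar to the query."""
    q = embed(query)
    ranked = sorted(entries, key=lambda e: cosine(q, embed(e)), reverse=True)
    return ranked[:top_k]

best = search("do you have any papers that discuss mouse behavior", KNOWLEDGE)
print(best[0])
```

With a real embedding model the match would rest on meaning rather than shared words, but the flow is the same: the summary is just another entry that the query can land on.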


Consider another way a chunk returned to the AI could allow it to answer questions:

Data source

Title: Bananas: A deep look
Summary: All about banana cultivars, speciation, growing, harvesting
Download source: mycompany.com/papers/banana.pdf
Page: 6

For Functions

plaintext_parser file location: /files/forage/banana_plaintext/{page}

Data

Musa species are native to tropical Indomalaya and Australia, and are likely to have been first domesticated in New Guinea. They are grown in 135 countries, primarily for their fruit, and to a lesser extent to make fiber, banana wine, and banana beer, and are sometimes even grown as ornamental plants. The world’s largest producers of bananas in 2017 were India and China, which together accounted for approximately 38% of total production. As of 2023, India was producing nearly 30.5 million tons of bananas each year, a little less than 20 million tons more than China.
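A chunk like the one above could be assembled at indexing time. The following is only a minimal sketch under my own assumptions: the `Chunk` class, `render`, and `page_location` are hypothetical names, and the layout simply mirrors the example. The `{page}` placeholder is kept literal in the rendered text so that a function call can later fill in whichever page it wants to read.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    """Hypothetical container for one retrievable piece of a file."""
    title: str
    summary: str
    source: str
    page: int
    parser_location: str  # template containing a literal "{page}" placeholder
    text: str

def render(chunk: Chunk) -> str:
    """Build the text that is embedded and later returned to the AI."""
    return "\n".join([
        "Data source",
        f"Title: {chunk.title}",
        f"Summary: {chunk.summary}",
        f"Download source: {chunk.source}",
        f"Page: {chunk.page}",
        "For Functions",
        f"plaintext_parser file location: {chunk.parser_location}",
        "Data",
        chunk.text,
    ])

def page_location(chunk: Chunk, page: int) -> str:
    """Resolve the template for a given page, e.g. to fetch a neighboring page."""
    return chunk.parser_location.format(page=page)

chunk = Chunk(
    title="Bananas: A deep look",
    summary="All about banana cultivars, speciation, growing, harvesting",
    source="mycompany.com/papers/banana.pdf",
    page=6,
    parser_location="/files/forage/banana_plaintext/{page}",
    text="Musa species are native to tropical Indomalaya and Australia...",
)
rendered = render(chunk)
next_page = page_location(chunk, chunk.page + 1)
```

Because every chunk carries its title, summary, and page, the AI can say which document an answer came from, and the file location gives a function a way to pull in surrounding pages when one chunk is not enough.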
