Welcome to the community!
Sounds like you’re using the API? Do you have access to the Advanced Data Analysis plugin?
The approach itself doesn’t look bad; the problem seems to be how you’re combining the data and feeding it to GPT. You may be missing a “parsing” layer.
If it were up to me, I would store the protocol strings in some kind of dictionary data structure or database. Lots of people have their own methods for this. I’m a tinkerer so I just make h5 files, but it’s more or less up to you.
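Just as a rough sketch of the h5 route (assuming h5py is installed; the file name, protocol IDs, and text are placeholders, not your actual data):

```python
import h5py

# Hypothetical protocol strings keyed by ID
protocols = {
    "protocol_001": "Step 1: ... Step 2: ...",
    "protocol_002": "Step 1: ... Step 2: ...",
}

with h5py.File("protocols.h5", "w") as f:
    str_dtype = h5py.string_dtype(encoding="utf-8")
    for name, text in protocols.items():
        # One dataset per protocol keeps each string individually addressable later
        f.create_dataset(name, data=text, dtype=str_dtype)

# Reading a single protocol back:
with h5py.File("protocols.h5", "r") as f:
    raw = f["protocol_001"][()]
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
```

A plain dict or a small database works just as well; the point is that each protocol stays its own retrievable unit instead of one giant string.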
Remember, these models have input and context-length limits. With something like Advanced Data Analysis you can hand it an entire database file, but it cannot handle an extremely long string in a single prompt.
Also, are you embedding the entire string as a single embedding? I’d recommend parsing the data so that each protocol gets its own vector embedding, and using those for context retrieval. If that’s already your intent, make sure your code is actually set up to do it; see the sketch below.
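Here’s a minimal sketch of per-protocol embedding plus similarity retrieval, assuming the openai>=1.0 Python client and numpy; the model name, protocol dict, and `retrieve` helper are illustrative, not your setup:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

protocols = {
    "protocol_001": "Step 1: ... Step 2: ...",
    "protocol_002": "Step 1: ... Step 2: ...",
}

# One embedding per protocol, not one embedding for the whole concatenated string
names = list(protocols)
resp = client.embeddings.create(model="text-embedding-3-small",
                                input=[protocols[n] for n in names])
vectors = np.array([d.embedding for d in resp.data])

def retrieve(query: str, top_k: int = 3) -> list[str]:
    """Return the IDs of the protocols most similar to the query."""
    q = np.array(client.embeddings.create(model="text-embedding-3-small",
                                          input=[query]).data[0].embedding)
    sims = vectors @ q / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(q))
    return [names[i] for i in np.argsort(sims)[::-1][:top_k]]
```

Then you only pull the top few matching protocols into the prompt as context, instead of the whole collection.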
You’re very much on the right track; from what you’ve shown us here, the missing piece is an extra parsing layer between your data and GPT. It CAN summarize large sets of data, so your goal is very achievable, you just can’t feed it the elephant in one prompt. Do it iteratively, or give GPT a database so it can summarize each string iteratively itself.
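Iteratively could look something like this (again a sketch, assuming the openai>=1.0 client; model name, prompts, and the protocol dict are placeholders): summarize each protocol on its own, then combine the short summaries in a final pass rather than pushing everything through one request.

```python
from openai import OpenAI

client = OpenAI()

protocols = {
    "protocol_001": "Step 1: ... Step 2: ...",
    "protocol_002": "Step 1: ... Step 2: ...",
}

def summarize_protocol(name: str, text: str) -> str:
    # One short request per protocol keeps each call well under the context limit
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Summarize the following protocol in a few sentences."},
            {"role": "user", "content": f"{name}:\n{text}"},
        ],
    )
    return resp.choices[0].message.content

summaries = {name: summarize_protocol(name, text) for name, text in protocols.items()}

# The combined summaries are short enough to feed back in for an overall summary
combined = "\n".join(f"{n}: {s}" for n, s in summaries.items())
```

Good luck, and let us know how it goes!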