Reading and summarizing technical documents

gabea · November 8, 2023, 2:39pm

I have been trying to use ChatGPT with the settings below to read a technical document:

GPT-4 turbo 1106
Advanced Analytics
The same document in both PDF and HTML formatting

I’ve been trying to get it to distill the document into bullet points that are atomic concepts. It seems to be struggling with the formatting, and not picking up context clues about the format of the document. I took inspiration from SPR. The following prompt is what I’ve been trying. Any suggestions for tweaks?

You are a gifted technical writer.

The goal of the process is to create a bulleted list that contains all of the knowledge from the uploaded document.

Assume that the end consumer of the output has no ability to access the original content, but will need to know all of the knowledge in it.

Each bullet point should be an atomic idea.

Each bullet should be a complete sentence.

Ensure each bullet is a self-contained piece of knowledge and doesn't rely on context of the overall document.

Pay attention to the formatting of the document and draw inferences to the knowledge. There may be a title for each section. The title probably provides additional context for the information underneath it. The sequence of paragraphs and information can also provide additional context.

Filter out bullet points that primarily serve as links to other pages without providing substantive knowledge on their own.

Skip bullet points that describe the audience of the document or its purpose and focus on technical details and instructions that provide clear knowledge or guidance.

When encountering information that is part of one concept, such as a list of items or steps that belong together, we need to ensure that they are captured as a single bullet point to maintain the integrity of the concept.

googcheng · November 8, 2023, 2:47pm

Maybe your request is too big for the model? If you want a summary, why not just ask it to summarize?

Fusseldieb · November 8, 2023, 2:49pm

Remove the double newlines; they might decrease the quality of the model's output.
Also, if possible, shorten the prompt: group related instructions together and cut redundancies.
If you ask for too many things at once, or make the instructions too abstract, the model will struggle (lower quality, ignored instructions, etc.).

For example, try something along these lines:

Each bullet point should be an atomic, self-contained sentence that doesn’t rely on the context of the overall document.
… etc etc

Also, I’ve observed that the order matters. Put the most important instructions last and the least important first. This is from my own observation - YMMV!
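
As a rough illustration of the suggestions in this post (single newlines, related instructions merged, most important instruction last), here is a minimal Python sketch. The `Instruction` class, the `build_prompt` function, and the priority numbers are hypothetical names made up for this example, not part of any OpenAI tooling. The benefit of this ordering is an anecdotal observation, not documented model behavior.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    text: str
    priority: int  # hypothetical scale: higher means more important


def build_prompt(role: str, instructions: list[Instruction]) -> str:
    """Join a role line and instructions with single newlines.

    Instructions are sorted by ascending priority, so the most
    important one ends up last in the prompt.
    """
    ordered = sorted(instructions, key=lambda item: item.priority)
    lines = [role.strip()]
    for item in ordered:
        # Collapse internal whitespace and newlines so that no
        # blank lines sneak into the final prompt.
        lines.append(" ".join(item.text.split()))
    return "\n".join(lines)


# Example instructions, condensed from the prompt in the first post.
prompt = build_prompt(
    "You are a gifted technical writer.",
    [
        Instruction(
            "Each bullet point should be an atomic, self-contained sentence "
            "that does not rely on the context of the overall document.",
            priority=3,
        ),
        Instruction(
            "Skip bullets that only link to other pages or describe "
            "the document's audience.",
            priority=1,
        ),
        Instruction(
            "Keep lists or steps that belong together in a single bullet.",
            priority=2,
        ),
    ],
)
print(prompt)
```

The priorities here are a judgment call; reorder them to match whatever matters most for your document.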

trenton.dambrowitz · November 8, 2023, 3:04pm

I’ve seen greater success putting my most important things last as well. I guess it makes sense if you think of it as fancy autocomplete: you pay more attention to where you left off than to where you started.

hollywoodsign · November 8, 2023, 3:25pm

Does anyone have a list of these nuggets?
