Using GPT-3 to evaluate summaries

I have a database of letters and expert-written summaries.
For each letter I also have a machine-generated summary.
I want to use GPT-3 to evaluate the quality of the machine-generated one.

I’m currently trying the following prompt and am fishing for ideas on how to approach this problem, or how to write a more effective prompt. So far the results have been mostly incorrect.


ReferenceSummary:

You have to pay ##22 dollars## to ##the department of justice##.
You need to pay within ##2 weeks##.
You’ve already received a letter reminding you of this.
If you do not pay, we will send an invoice collector, which will be expensive.

EvaluationSummary:

You have to pay a bill to the department of justice.
You’ve already been sent a reminder.
If you do not pay, it will get more expensive.

Above are two summaries of a letter. The first is the reference summary; the second is the one you have to evaluate.

It is not important that the wording matches exactly. It is, however, important that the language is simple, clear and concise.

Important items have been marked with ## characters. If one is missing from the evaluation summary, subtract 50 points.

Please respond with a JSON object containing a 'score' key between 0 and 100 and a 'reason' key explaining why you gave that score.
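
For context, this is roughly how I call the API and parse the result. It is a minimal sketch using the legacy openai Python package (pre-1.0) with a text-davinci-003 completion; build_prompt is just a helper that stitches the two summaries into the prompt above, so treat the exact wiring as an assumption rather than my full pipeline.

import json
import openai

openai.api_key = "sk-..."  # my API key

def build_prompt(reference_summary, evaluation_summary):
    # Stitch the two summaries into the evaluation prompt shown above.
    return (
        "ReferenceSummary:\n\n" + reference_summary + "\n\n"
        "EvaluationSummary:\n\n" + evaluation_summary + "\n\n"
        "Above are two summaries of a letter. The first is the reference summary; "
        "the second is the one you have to evaluate.\n\n"
        "It is not important that the wording matches exactly. It is, however, "
        "important that the language is simple, clear and concise.\n\n"
        "Important items have been marked with ## characters. If one is missing "
        "from the evaluation summary, subtract 50 points.\n\n"
        "Please respond with a JSON object containing a 'score' key between 0 "
        "and 100 and a 'reason' key explaining why you gave that score."
    )

def evaluate(reference_summary, evaluation_summary):
    response = openai.Completion.create(
        model="text-davinci-003",   # GPT-3 completion model
        prompt=build_prompt(reference_summary, evaluation_summary),
        max_tokens=256,
        temperature=0,              # keep the scoring as repeatable as possible
    )
    text = response["choices"][0]["text"].strip()
    return json.loads(text)         # fails if the model does not return valid JSON

I keep temperature at 0 so repeated runs on the same letter give the same score, which makes it easier to compare prompt variants.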