How to improve quality of answers in Question Answering

nelson · June 4, 2021, 4:40am

Given a summary of about 1000 tokens, what is the best way to validate whether the answers generated by Question Answering are correct?
We ran a test by submitting about 5000 prompts to davinci, each requesting an answer to a question about the summary. We found that about 78% of the answers were acceptable, while the others were either wrong or irrelevant. We use semantic search to filter out the irrelevant answers, which still leaves about 8% of answers that are relevant but incorrect.

Any suggestions on what other processing we can do to filter out the remaining 8% of incorrect answers with human validation?

luke · June 5, 2021, 12:48am

Hey Nelson! To make sure I’m understanding, is the goal to generate an answer to a question based on information in a summary?

nelson · June 5, 2021, 4:01am

Hi Luke, thanks for responding. That is correct.
If you will be kind enough to try this example, you will see what I am talking about…

{
  "prompt": "Santa Claus, also known as Father Christmas, Saint Nicholas, Saint Nick, Kris Kringle, or simply Santa, is a legendary character originating in Western Christian culture who is said to bring gifts on Christmas Eve of toys and candy to well-behaved children, and either coal or nothing to naughty children. He is said to accomplish this with the aid of Christmas elves, who make the toys in his workshop at the North Pole, and flying reindeer who pull his sleigh through the air.\nThe modern character of Santa Claus was based on traditions surrounding the historical Saint Nicholas (a fourth-century Greek bishop and gift-giver of Myra), the English figure of Father Christmas, and the Dutch figure of Sinterklaas (also based on Saint Nicholas).\nSanta Claus is generally depicted as a portly, jolly, white-bearded man, often with spectacles, wearing a red coat with white fur collar and cuffs, white-fur-cuffed red trousers, red hat with white fur, and black leather belt and boots, carrying a bag full of gifts for children. He is commonly portrayed as laughing in a way that sounds like \"ho ho ho\". This image became popular in the United States and Canada in the 19th century due to the significant influence of the 1823 poem \"A Visit from St. Nicholas\". Caricaturist and political cartoonist Thomas Nast also played a role in the creation of Santa's image. This image has been maintained and reinforced through song, radio, television, children's books, films, and advertising. \n\nQ: Was Thomas Nast an advertiser?\nA:",
  "temperature": 0,
  "max_tokens": 100,
  "top_p": 0.25,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "stop": ["\n"]
}

nelson · June 5, 2021, 5:38am

The answer I got back was “Yes, he was” which is wrong.
I think the main problem is that Thomas Nast played a role in creating Santa’s image, and the image was used for advertising.
It’s understandable that GPT-3 won’t get every answer correct, but if I can run another fact check and remove the incorrect answers, that will resolve my problem.
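
One way to picture the extra fact check is a second request that asks the model to judge the first answer against the text. This is only a minimal sketch: the function names, the prompt wording, and the "Verdict:" label are assumptions made up for illustration, not something tested on the 5000 prompts. It only builds the request body in the same shape as the example above and does not call the API.

```python
def build_verification_request(context: str, question: str, answer: str) -> dict:
    """Build a second-pass request asking whether the text supports the answer.

    The prompt wording is hypothetical; adjust it for your own data.
    """
    prompt = (
        f"{context.strip()}\n\n"
        f"Question: {question.strip()}\n"
        f"Proposed answer: {answer.strip()}\n"
        "Does the text above explicitly state the proposed answer? "
        "Reply Yes or No.\n"
        "Verdict:"
    )
    # Same parameter names as the request shown earlier in the thread.
    return {
        "prompt": prompt,
        "temperature": 0,
        "max_tokens": 5,
        "stop": ["\n"],
    }


def keep_answer(verdict_text: str) -> bool:
    """Keep the original answer only if the verdict completion starts with 'yes'."""
    return verdict_text.strip().lower().startswith("yes")
```

Answers whose verdict is not "Yes" would be dropped or sent for review. Whether the model judges its own answers more reliably than it produces them is something that would need to be measured.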

Alan · June 5, 2021, 2:59pm

Facts are always based on assumptions, and so in a way, you’re getting GPT-3’s version of the facts which is really always going to be an opinion. So in cases like this, it’s useful to probe how GPT-3 arrived at that answer. As usual, the argument it uses is quite an interesting one, and I’m willing to buy GPT-3’s version of the facts in this case… Here’s what I got at the settings you used when I went on to ask it about the answer:
Q: Was Thomas Nast an advertiser?
A: Yes, he was.

Q: Why was Thomas Nast an advertiser?
A: Because he was a cartoonist.

Q: Are all cartoonists advertisers?
A: No, not all.

Q: Then what makes a cartoonist and advertiser?
A: The desire to make money.

Q: So is every cartoonist that gets paid for their work is an advertiser? Or are they only an advertiser if their work is trying to sell a particular product or service?
A: It’s the latter.

Q: What particular product or service was Thomas Nast trying to sell?
A: Santa Claus.

Q: It sounds like you consider Santa Claus to be product or service? Why is Santa a product or service?
A: Because he’s a brand.
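
The probing above can be scripted by appending each answer and the next question to the running prompt. A minimal sketch, assuming the prompt always ends on an open "A:" as in the example request; `add_follow_up` is a hypothetical helper name, not part of any library.

```python
def add_follow_up(prompt: str, answer: str, follow_up: str) -> str:
    """Append the model's answer and a new question, ending on an open 'A:'."""
    return f"{prompt.rstrip()} {answer.strip()}\n\nQ: {follow_up.strip()}\nA:"


# Usage: extend the prompt after the first completion comes back.
prompt = "Some summary text.\n\nQ: Was Thomas Nast an advertiser?\nA:"
prompt = add_follow_up(prompt, "Yes, he was.", "Why was Thomas Nast an advertiser?")
```

With the stop sequence set to a newline, each completion stays on one line, so the extended prompt keeps the same Q/A layout for the next turn.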

Alan · June 6, 2021, 12:24pm

I liked the “Santa Claus is a brand” answer… I’d never thought of it quite that way, and sadly realize it’s true.

luke · June 7, 2021, 9:24pm

Which model are you using? I find davinci instruct is best for these sorts of ‘closed domain’ Q&A tasks where all of the relevant information required to answer the question is in the prompt. I’ve had some success with this example, although I haven’t tested rigorously: OpenAI API.

Let me know how it goes!

nelson · June 8, 2021, 10:17am

Hi Luke,
Thanks for providing the sample for closed domain.
I really like how it replies “I don’t know” when the probability is not high enough.
I’ll do further testing on this, much appreciated.
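
For anyone reading later without the linked sample: the idea of a closed-domain prompt with an “I don’t know” fallback could look roughly like the sketch below. The instruction wording and the helper names are assumptions for illustration only and are not Luke's actual example.

```python
FALLBACK = "I don't know"


def build_closed_domain_prompt(context: str, question: str) -> str:
    """Build a prompt that tells the model to answer only from the given text."""
    return (
        "Answer the question using only the text below. "
        f'If the text does not contain the answer, reply "{FALLBACK}".\n\n'
        f"Text: {context.strip()}\n\n"
        f"Q: {question.strip()}\nA:"
    )


def is_abstention(answer: str) -> bool:
    """Return True when the completion is the fallback reply."""
    # Normalize surrounding quotes, periods, and curly apostrophes first.
    normalized = answer.strip().strip('."“”').lower().replace("’", "'")
    return normalized.startswith(FALLBACK.lower())


def split_answers(answers: list[str]) -> tuple[list[str], list[str]]:
    """Separate real answers from abstentions so abstentions can be handled apart."""
    kept = [a for a in answers if not is_abstention(a)]
    abstained = [a for a in answers if is_abstention(a)]
    return kept, abstained
```

Abstentions can then be counted separately from wrong answers when measuring accuracy on a test set.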

sai.mandyala · August 31, 2022, 11:15pm

Hi Nelson,

I’m working on something similar. If you are dealing with a large document set to provide the contexts, you can go through this: https://github.com/openai/openai-cookbook/tree/main/examples/fine-tuned_qa

Also, OpenAI can answer a lot of questions depending on how creatively the prompt is engineered, so do go through the prompt engineering topic in the forum.

nelson · December 16, 2022, 1:57am

Hi Sai,

Thanks for the information. I do find the latest text-davinci-003 does a much better job on this now.
Great work by the OpenAI team.
