Why OpenAI recruited human contractors to improve GPT-3

… [SNIP]… This is exactly what OpenAI did with GPT-3 recently when it hired 40 human contractors to help steer the model’s behavior.

The team were given a set of text prompts and asked to write corresponding answers. Engineers at OpenAI collected these responses and fine-tuned GPT-3 on the dataset to show the machine how a human would reply.

The contractors were also asked to rank a list of responses produced by GPT-3 by quality. This data was used to train a reinforcement learning model to learn what made a reply good or bad. The model was then used to calculate a score for possible GPT-3 text generations. Those that scored highly were more likely to be selected as output for the user than those that scored lower, according to a research paper. [SOURCE]
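
To make the ranking step a bit more concrete, here is a minimal sketch of the idea, not OpenAI's actual code. It turns a human ranking into (better, worse) pairs, fits a tiny scoring model on those pairs, and then picks the highest-scoring candidate. Everything here is an assumption for illustration: the `features` function is a hypothetical toy stand-in for a neural network, the pairwise logistic loss is one common way to learn from rankings (the article does not say which loss was used), and always taking the top score simplifies the article's "more likely to be selected".

```python
import math
from itertools import combinations


def features(text):
    # Hypothetical toy features (an assumption, not from the article):
    # scaled word count and whether the reply ends with a period.
    words = text.split()
    return [len(words) / 10.0, 1.0 if text.strip().endswith(".") else 0.0]


def score(weights, text):
    # Linear score: higher is meant to indicate a better reply.
    return sum(w * x for w, x in zip(weights, features(text)))


def ranking_to_pairs(ranked):
    # `ranked` is ordered best-first, so every pair is (better, worse).
    return list(combinations(ranked, 2))


def train_reward_model(rankings, epochs=200, lr=0.5):
    # Fit weights so that better replies score above worse ones,
    # using gradient steps on the pairwise logistic loss.
    weights = [0.0, 0.0]
    for _ in range(epochs):
        for ranked in rankings:
            for better, worse in ranking_to_pairs(ranked):
                margin = score(weights, better) - score(weights, worse)
                # Step size shrinks as the pair becomes correctly ordered.
                g = 1.0 - 1.0 / (1.0 + math.exp(-margin))
                fb, fw = features(better), features(worse)
                weights = [w + lr * g * (b - c)
                           for w, b, c in zip(weights, fb, fw)]
    return weights


def pick_best(weights, candidates):
    # Simplification: always return the top-scoring candidate.
    return max(candidates, key=lambda c: score(weights, c))


# Made-up rankings standing in for contractor data (best reply first).
rankings = [
    ["Paris is the capital of France.", "Paris", "capital france paris paris"],
    ["Water boils at 100 degrees Celsius at sea level.", "100", "hot hot hot"],
]

weights = train_reward_model(rankings)
print(pick_best(weights, ["sky blue", "The sky is blue."]))
```

In a real system the scoring model would be a large neural network trained on many thousands of comparisons, but the shape of the loop is the same: humans rank, a model learns to reproduce the ranking, and the learned score is used to prefer some generations over others.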


The question is its own answer?