Dent
January 13, 2023, 6:01pm
3
The OpenAI cookbook has some good tips on getting more reliable solutions. For multi-step math problems, prepend the response with “Let’s think step by step:” and accuracy scores shoot up from 18% to a decent 79%!
# Techniques to improve reliability
When GPT-3 fails on a task, what should you do?
- Search for a better prompt that elicits more reliable answers?
- Invest in thousands of examples to fine-tune a custom model?
- Assume the model is incapable of the task, and move on?
There is no simple answer - it depends. However, if your task involves logical reasoning or complexity, consider trying the techniques in this article to build more reliable, high-performing prompts.
## Why GPT-3 fails on complex tasks
If you were asked to multiply 13 by 17, would the answer pop immediately into your mind? For most of us, probably not. Yet, that doesn't mean humans are incapable of two-digit multiplication. With a few seconds, and some pen and paper, it's not too taxing to work out that 13 x 17 = 130 + 70 + 21 = 221.
Similarly, if you give GPT-3 a task that's too complex to do in the time it takes to calculate its next token, it may confabulate an incorrect guess. Yet, akin to humans, that doesn't necessarily mean the model is incapable of the task. With some time and space to reason things out, the model still may be able to answer reliably.
As an example, if you ask `text-davinci-002` the following math problem about juggling balls, it answers incorrectly:
```text-davinci-002
Q: A juggler has 16 balls. Half of the balls are golf balls and half of the golf balls are blue. How many blue golf balls are there?
This file has been truncated. show original
1 Like