Alternative Q&A formats question

Hi everyone,

Would it be possible to flip the Q&A question format to True/False?

For example, take the Q&A example question on the OpenAI page here: OpenAI API. Change the input to a statement and the output to True/False, while still referencing a context document as the answer source, in the same manner as the example on the linked page.

Example functionality:
Input: Human life expectancy in the United States is 78 years
Output: TRUE

I would also like to be able to change the wording of the statement and still get a TRUE / “correct” value.
For example:
In the US, people live to 78 years old on average - TRUE
In the US, people live to nearly 80 years of age - TRUE
The US life expectancy is 78 - TRUE

Determining whether the input is True or False would be based on the context document only.

Creating this is exactly what got me so excited about OpenAI. I’ve been looking for a method to deploy a tool that can do this for years and would love to hear what it would take.

Thank you!



Hi Adam, that’s a great question!

I’ve played around in the playground, and this is totally doable. You can see my experiments below. In short, I simply provided instructions and examples aligning with what you’re looking for (from what I understand: answering a statement with TRUE or FALSE, given a context document). I set a low response length, a temperature of 0, and a newline stop sequence.
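For anyone reading along without the playground screenshots, here is a minimal sketch of how such a prompt could be assembled. It is not the exact prompt from my experiments: the instruction wording, the `build_prompt` and `parse_answer` helpers, and the FALSE example are all made up for illustration, and the setting names (`max_tokens`, `temperature`, `stop`) are assumed to match the completion parameters you use.

```python
def build_prompt(context, examples, statement):
    """Assemble a few-shot TRUE/FALSE prompt grounded in a context document."""
    lines = [
        "Decide whether each statement is TRUE or FALSE using only the context.",
        "",
        f"Context: {context}",
        "",
    ]
    # Each worked example shows the model the expected answer format.
    for text, label in examples:
        lines.append(f"Statement: {text}")
        lines.append(f"Answer: {label}")
        lines.append("")
    # The final statement is left unanswered for the model to complete.
    lines.append(f"Statement: {statement}")
    lines.append("Answer:")
    return "\n".join(lines)


# Settings described above: short response, temperature 0, newline stop.
# The key names are assumptions; adjust them to your client.
REQUEST_SETTINGS = {"max_tokens": 3, "temperature": 0, "stop": ["\n"]}


def parse_answer(completion_text):
    """Normalize the completion to 'TRUE' or 'FALSE'; None if it is neither."""
    word = completion_text.strip().upper()
    return word if word in ("TRUE", "FALSE") else None


context = "Human life expectancy in the United States is 78 years."
examples = [
    ("The US life expectancy is 78", "TRUE"),
    ("The US life expectancy is 60", "FALSE"),  # illustrative FALSE case
]
prompt = build_prompt(
    context, examples, "In the US, people live to 78 years old on average"
)
```

You would send `prompt` together with `REQUEST_SETTINGS` to the completion endpoint and pass the returned text through `parse_answer`.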

I hope that helps, and please let me know if you have any other questions.


This is perfect!

Thank you for that example. I’ve found cases where it doesn’t fully catch the context, but only a few.

Do you know if there is a way to understand the ‘confidence level’ of the output, i.e., 90% confident it’s TRUE or similar?

Hi Adam, unfortunately that’s not a feature that’s currently provided, though you could estimate it yourself if you’re doing something like classification on known data.
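One way to read "on known data" is to run the classifier over statements whose correct labels you already have and count how often each predicted label turns out right. Here is a rough sketch under that assumption; `empirical_confidence` and the `classify` callable are hypothetical names, and the stub below only stands in for a real call to the model.

```python
from collections import Counter


def empirical_confidence(classify, labeled_statements):
    """For each predicted label, return the fraction of predictions that were correct."""
    correct = Counter()
    total = Counter()
    for statement, gold in labeled_statements:
        predicted = classify(statement)
        total[predicted] += 1
        if predicted == gold:
            correct[predicted] += 1
    return {label: correct[label] / total[label] for label in total}


def stub_classify(statement):
    # Placeholder for the model call: says TRUE whenever "78" appears.
    return "TRUE" if "78" in statement else "FALSE"


known_data = [
    ("The US life expectancy is 78", "TRUE"),
    ("The US life expectancy is 78 months", "FALSE"),
    ("The US life expectancy is 60", "FALSE"),
]
scores = empirical_confidence(stub_classify, known_data)
```

With a large enough labeled set, `scores["TRUE"]` gives a rough idea of how much to trust a TRUE output. It is an aggregate rate over your data, not a confidence for any single answer.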

Yes, I think in theory it’s possible. It is an interesting question. How does IBM Watson estimate its confidence level? I couldn’t find any description of the software models behind the IBM system. GPT-3 is based on deep neural networks, so in principle it cannot give a confidence for whether what it recalls from memory is true, half-true, or a confabulation. In the Gödel sense there is no absolute truth. If the data needed to answer a question was absent at training time, it tries to make some random guess. I tried asking it about GPT-3’s VRAM usage and the answer was 8 GB. Maybe the model took a figure like “GPT-2 model only needs 3.09 GB” and scaled it by a factor of 2.5?