Priming Zero-Shot Learners towards Factfulness - GPT-3 vs. GPT-J-6B

I made a video on simple techniques to nudge ZSL toward facts. Anyone got more suggestions?


Quite excited to experiment with it, because it's been trained on more recent, higher-quality data than GPT-3: the Pile dataset.


It's not that they chose it; it's a convention (written or unwritten) among language models. Look only at Hugging Face models and you'll see top-p, top-k, temperature, max tokens, and other parameters across the board.


Great experimentation and a much-requested direction.

A detail, but if you want to validate truthfulness, I think you should also cap top-p; generally speaking, it may be better to introduce stochasticity via top-p than via temperature (beyond a certain level). I also think it makes more sense to sample only the first item in the list rather than multiple items in one go. As you yourself noted, it degenerates.
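To make the suggestion concrete, here is a minimal, self-contained sketch of nucleus (top-p) sampling with a temperature term, in plain Python/NumPy rather than any particular library's API. The function name, defaults, and the `greedy` flag are all illustrative, not a recommendation:

```python
import numpy as np

def sample_next_token(logits, temperature=0.7, top_p=0.9, greedy=False, rng=None):
    """Sketch of top-p (nucleus) sampling with a temperature cap.

    `logits` is a 1-D array of unnormalized scores over the vocabulary.
    With `greedy=True` it just takes the single most likely token, which
    is the "sample only the first item" idea from the post above.
    """
    rng = rng or np.random.default_rng()
    if greedy:
        # Deterministic: always return the most likely token.
        return int(np.argmax(logits))
    # Temperature < 1 sharpens the distribution toward the top tokens.
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))
    probs /= probs.sum()
    # Keep only the smallest set of tokens whose cumulative mass reaches top_p.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1
    kept = order[:cutoff]
    kept_probs = probs[kept] / probs[kept].sum()
    return int(rng.choice(kept, p=kept_probs))
```

Capping `top_p` low trims the long tail of unlikely tokens, which is often where the fabricated-sounding continuations come from.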

Do we think this is a good test, though? Something close to the title you used is likely to appear in hundreds of pseudoscientific sources the models have been trained on. Without context, you are not really asking it to predict what is factually true, but what humans tend to claim is scientifically vetted; something I would not trust when it comes to diets.

I think that to test factfulness we should probably run more controlled tests (e.g. a unique question that requires combining facts), as well as experiment with prompts that signal a factual answer is expected.
