Can you stop it from making up sources?

A Twitter user asked whether GPT-3 can be trained not to make up references.

Educators are using fake references as a way to detect AI-generated prose. As an educator I don’t want to lose this technique, but it’s pretty clear it will be ephemeral.

My guess is that it would help to give GPT-3 examples of prompts for which no references are available, paired with completions that offer none.
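To make that guess concrete, here is a minimal sketch of a few-shot prompt built from "no reference available" examples. Everything in it is an assumption for illustration: the example questions are deliberately silly, the `Q:`/`A:` layout and the names `NO_REFERENCE_EXAMPLES` and `build_few_shot_prompt` are hypothetical, and it only builds the prompt text rather than calling any model. Whether this actually reduces made-up references would need testing.

```python
# Hypothetical examples where the desired completion declines to cite anything.
NO_REFERENCE_EXAMPLES = [
    (
        "Cite a study on how left-handed scissors affect sourdough rise times.",
        "I don't know of any study on this, so I can't offer a reference.",
    ),
    (
        "Give a reference for the average number of buttons on medieval mittens.",
        "I have no reliable source for this, so I won't cite one.",
    ),
]


def build_few_shot_prompt(question, examples=NO_REFERENCE_EXAMPLES):
    """Place the worked examples before the new question, leaving the last answer open."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(build_few_shot_prompt("Cite a paper on the history of forum like buttons."))
```

The same question/completion pairs could also serve as fine-tuning data instead of in-prompt examples.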

Apologies if I missed a thread on this in my search.


This is a key research question for all LLMs. To be more specific, this problem is called “confabulation,” which is defined as follows:

Confabulation is the creation of false memories in the absence of intentions of deception. Individuals who confabulate have no recognition that the information being relayed to others is fabricated.

When we examine the conditions that cause confabulation in humans (brain injury or brain disease), it becomes clear that GPT-3 and other LLMs are quite simply cognitively incomplete systems. What I mean by this is that GPT-3 has no theory of mind for itself. It does not know what it knows and what it does not know. Human brains have specific signals and brainwaves that appear when we know something vs. when we don’t. In other words, our architecture is designed to detect factual consistency.

It’s important to remember that GPT-3 was trained on internet data, which generally represents conventional human perspectives, although those perspectives sometimes differ vastly from one another. It was also trained on fictional sources, like Gutenberg. As such, GPT-3’s sense of reality is dubious at best. It never had a body with which it could test its understanding of reality (and differentiate between reality and imagination) in the pedagogical way that humans do.

Essentially, for now, confabulation is something we just have to tolerate.


Using strict instructions should help. Somewhat similarly, I’m building a Q&A tool where I only want answers based on the provided sources, not on any outside data. I instruct GPT-3 to limit itself to those sources and nothing else, and it works pretty well. Re spotting fake references, that’s a great product idea: build a system that allows teachers (and others) to run an essay through GPT-3 to check whether its references are accurate. You might connect with the people building Elicit, as they would care about this issue. https://elicit.org/search
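As a rough illustration of what "strict instructions" could look like, here is a sketch that assembles a source-restricted prompt. The instruction wording, the fallback sentence, and the function name `build_grounded_prompt` are assumptions made for this example, not the actual prompt from the Q&A project described above, and the code stops at building the text rather than calling a model.

```python
def build_grounded_prompt(question, sources):
    """Build a prompt that tells the model to answer only from the given sources."""
    # Number the sources so an answer can point back to them.
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(sources, start=1))
    return (
        "Answer the question using ONLY the numbered sources below. "
        "Cite sources by number. If the sources do not contain the answer, "
        'reply exactly "Not found in the provided sources."\n\n'
        f"Sources:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    print(build_grounded_prompt(
        "What is confabulation?",
        ["Confabulation is the creation of false memories "
         "in the absence of intentions of deception."],
    ))
```

Giving the model an explicit fallback reply is a design choice: it offers a sanctioned way out when the sources are silent, instead of leaving fabrication as the only way to complete the answer.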


Awesome–thank you! I just signed up for Elicit and sent them a query about this. Good to know it’s possible to limit GPT-3 to certain sources.

Ah, it helps so much to understand the context for the answer here and the missing link in theory of mind. So it has no way to stop itself or recognize when its own training data is scanty for a particular completion. I think I need to read your book at some point after I learn a few more basics first!


Here’s a video I made that addresses one method. Not sure if you’ve seen it:

Basically, you can split it up into 2 tasks: (1) Do I actually know this information? Y/N (2) If I have the information, what is the answer?

This video shows how I fine-tuned both of these cognitive tasks into a single step.
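For anyone who wants to see the two tasks spelled out, here is a minimal sketch that runs them as two separate calls. This is not the fine-tuned single-step approach from the video; it is a plain two-call version under stated assumptions. The `complete` argument stands for whatever function sends a prompt to a model and returns text, and `fake_complete` is a made-up stand-in so the example runs without any API.

```python
from typing import Callable


def answer_if_known(question: str, complete: Callable[[str], str]) -> str:
    """Task 1: ask whether the answer is known. Task 2: answer only on a yes."""
    check = complete(
        "Do you actually know the answer to this question? Reply Y or N.\n"
        f"Question: {question}\nReply:"
    )
    if not check.strip().upper().startswith("Y"):
        return "I don't know."
    return complete(f"Question: {question}\nAnswer:").strip()


def fake_complete(prompt: str) -> str:
    """Hypothetical stand-in for a model call, backed by a tiny lookup table."""
    known = {"What is 2 + 2?": "4"}
    # Pull the question text back out of the prompt.
    question = prompt.split("Question: ", 1)[1].split("\n", 1)[0]
    if "Reply Y or N" in prompt:
        return "Y" if question in known else "N"
    return known[question]


if __name__ == "__main__":
    print(answer_if_known("What is 2 + 2?", fake_complete))
    print(answer_if_known("Who wrote the 1987 paper on mitten buttons?", fake_complete))
```

With a real model, the first step is only as good as the model's ability to judge its own knowledge, which is exactly the weak spot discussed earlier in this thread.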
