Out of curiosity, why has ‘hallucination’ been chosen as the word to describe these errors? It seems more appropriate to describe them as confabulations, but my background is in neuroscience/neurology rather than computer science. I’m sure I’m missing something simple.
Yes, it would be more accurate to say that AI models, especially language models like GPT-4, confabulate rather than hallucinate. Confabulation refers to the generation of plausible-sounding but potentially inaccurate or fabricated information, which is a common characteristic of AI language models when they produce responses based on limited or incomplete knowledge. This term better captures the nature of AI outputs as it emphasizes the creation of coherent, yet possibly incorrect, information rather than suggesting the experience of sensory perceptions in the absence of external stimuli, as hallucination implies.
I think your idea here is great. “Confabulation” is definitely a more accurate term than “hallucination”. I think “hallucination” was chosen because it is a general term, more readily understandable by most people. “Confabulation” is more precise, but most people, especially non-native speakers like myself, would have to look it up to know exactly what it means.
Keep in mind that GPT doesn’t have “thoughts”; it attempts to produce the output that is most likely to gain positive feedback from you. It does this based on probability, so if we ask it a very leading question, like this one:
would it be more accurate to say that AI models, especially language models like GPT-4, hallucinate rather than confabulate?
By telling the model “hallucinate rather than confabulate”, we’re telling it what to answer, and it does exactly that:
In the context of AI language models like GPT-4, the term “hallucinate” is sometimes used to describe situations where the model generates text that is not factually accurate or consistent with reality. For example, if a language model generates a news article about a fictitious event, that could be considered a form of “hallucination.” However, the term “confabulate” is typically used to describe situations where a person generates false memories or information in an attempt to fill gaps in their memory. This concept doesn’t really apply to AI models.
Researchers started referring to such undesirable generation as hallucination [1].
The term “hallucination” first appeared in Computer Vision (CV) in Baker and Kanade and carried more positive meanings, such as super-resolution [5, 112], image inpainting, and image synthesizing. Such hallucination is something we take advantage of rather than avoid in CV. Nevertheless, recent works have started to refer to a specific type of error as “hallucination” in image captioning [13, 159] and object detection [4, 83], which denotes non-existing objects detected or localized incorrectly at their expected position. The latter conception is similar to “hallucination” in NLG.
Feel free to run down the references cited there. The footnote is enough for me.
In artificial intelligence (AI), a hallucination or artificial hallucination (also occasionally called confabulation or delusion) is a confident response by an AI that does not seem to be justified by its training data.
This is interesting. Based on the historical context, could the processes that led to the initial hallucination cases (reporting things that aren’t there in an image, in computer vision?) be considered something entirely different from the processes that result in confabulation in LLMs? Would it be useful to separate these terms while we’re trying to understand these errors more deeply?
My personal view is no. Both words have an established meaning. Reusing a word with an established meaning and/or adding another meaning to it just compounds the problem.
Many decades ago I took Latin and learned how many modern words in languages derived from Latin were created, such as “galaxy”. (OK, “galaxy” derives from Greek, but the idea is the same.) It was then obvious how so many words came about over the centuries. But now the art of coining words from Latin bases is fading away, and instead people are reusing existing words. One word that is badly overused is “functor”; it is a pain to use as a keyword because it is connected to so many things with very different meanings.
What is even worse is one-letter operators in programming languages. How many know that the comma (,) is really a binary operator? Most just think of it as a line ending or a separator, but if you write parsers, it really is an operator just as much as (+) is.
It seems to me that the original use of “hallucination” was more objective, rather than suggesting the output was intrinsically wrong, as per this quote.
So under the right conditions, hallucinations are “creative”. But in the context of some prompts, that creativity is not welcome; the output is just outright wrong, or even a lie. And we already recognize this in our own language: “she was creative with the truth”, euphemistically meaning she lied.
So based on that, I think “hallucination” might be a useful generalized term, which becomes “creative” when we want fiction or another creative output, and “confabulation” when we are looking for facts.
Don’t be too hard on yourself. Even if you had phrased your question in a perfectly neutral way, GPT would still have applied different weights to your input according to its pre-training data and the order of appearance.
By the way, when a patient is confabulating, it’s bizarre and fascinating (I consider myself a compassionate doctor, and this isn’t something I play with when I’m interacting with patients, but there are times when you need to legitimately test for it). You’ll hear this incredibly compelling, fabricated story if you just start it off in pretty much any direction. To me, it almost seems to reflect the LLM that we all have (when the processes that ensure and incorporate reason and context aren’t functioning appropriately, but the language center is completely spared).
I like the idea of using a term (even just personally) that makes it clear LLM outputs aren’t really a reliable source for truthful information (yet…). I feel like confabulation is more appropriate semantically (compared to hallucination) for inaccurate or untruthful LLM outputs and that the opposite could be considered true for things related to computer vision. Especially after learning about an original/very early use thanks to the scholar @EricGT .
I think “untruth” would be reasonable, but Orwell gave that one a bad rap.
My background is also in neuroscience/neuropsychology. I think it’s more likely that the computer scientists didn’t know that confabulation is exactly the term they were reaching for. Wonder if it ever changes or if this sticks.