
AI Hallucinations: What Exactly Are They? Could AI Chatbots Merely Be Daydreaming?

VIVE POST-WAVE Team • Jan. 11, 2024

5-minute read

At the end of last year (December 27), The New York Times filed a lawsuit against OpenAI and Microsoft, accusing the two companies of using its published content to train the language models behind their AI chatbots. The lawsuit also cites "AI hallucinations" that could damage the paper's brand reputation: when answering a question, ChatGPT or Bing Chat sometimes returns outright nonsense and then attributes it to The New York Times.

The New York Times is not alone in its concerns.

If AI Deceived You

The Cambridge Dictionary's Word of the Year for 2023 is "Hallucinate," stemming from the craze for large language models (LLMs) like ChatGPT. The dictionary notes, "When an artificial intelligence (= a computer system that has some of the qualities that the human brain has, such as the ability to produce language in a way that seems human) hallucinates, it produces false information."

Famous examples include: Google's Bard made a mistake during its debut demo by attributing "the first photo of an exoplanet" to the James Webb Space Telescope (the correct answer is the Very Large Telescope of the European Southern Observatory); and last May in New York, a lawyer used ChatGPT to draft court filings that contained fabricated case citations and was sanctioned by the court as a result.


In my own testing, when I asked GPT 3.5 to name a few AI experts on AI hallucinations, four of the five experts listed in ChatGPT's response were made up; only the third, Ian Goodfellow, is a real person.

Asking GPT 3.5 to cite what several artificial intelligence experts have said about AI hallucination; the result is itself an AI hallucination.
GPT 3.5's answer ends up illustrating the very concept of AI hallucination. Is this a case of a negative proving a positive?

In fact, according to a Vectara survey, all major language models currently have "hallucination" issues. As shown in the table below, GPT 4 has the lowest "hallucination rate" at 3%, while Google Palm 2 Chat is as high as 27.2% (notably, it also gives the longest answers).

Vectara's statistics on the hallucination rates of major language models.

The Ire of a Linguistics Master

The spread of AI hallucinations brings to mind the 95-year-old linguistics legend Noam Chomsky's essay in The New York Times in March last year: The False Promise of ChatGPT.

In the article, Chomsky harshly criticizes large language models, arguing that they betray the essence of language and produce nothing but falsehoods, mediocrity, and evil — even using Hannah Arendt's concept of "the banality of evil" to attack ChatGPT, showing his anger.

Chomsky insists that the value of human language lies in the ability to explain with minimal information, while large language models merely describe and predict text, lacking counterfactual thinking and moral reasoning. Counterfactual thinking (imagining and reasoning about situations other than the facts) extends our thinking beyond the available clues, while moral reasoning reminds us that seemingly boundless thought is still constrained by worldly principles.

Chomsky's examples include: "Here’s an example. Suppose you are holding an apple in your hand. Now you let the apple go. You observe the result and say, 'The apple falls.' That is a description. A prediction might have been the statement 'The apple will fall if I open my hand.' Both are valuable, and both can be correct. But an explanation is something more: It includes not only descriptions and predictions but also counterfactual conjectures like 'Any such object would fall,' plus the additional clause 'because of the force of gravity' or 'because of the curvature of space-time' or whatever. That is a causal explanation: 'The apple would not have fallen but for the force of gravity.' That is thinking."

And: "In 2016, for example, Microsoft's Tay chatbot (a precursor to ChatGPT) flooded the internet with misogynistic and racist content, having been polluted by online trolls who filled it with offensive training data." Throughout the article, Chomsky dismissively argues that the predictions of large language models will always be dubious and superficial.

Why Would AI Spout Nonsense?

From another perspective, things might be entirely different, and hallucinations might not be hallucinations at all.

Hallucinations Are How They Operate

Andrej Karpathy, a founding member of OpenAI, might help us get some clarity on what these so-called hallucinations are. In his musings on X (Twitter) at the end of last year, which also carried a hint of exasperation, he argues that the essence of large language models is dreaming, and that this is not a problem but the way they operate; it's not a flaw but a feature.

Andrej Karpathy compares the operation of large language models to dreaming. "We direct their dreams with prompts. The prompts start the dream, and based on the LLM's hazy recollection of its training documents, most of the time the result goes someplace useful. It's only when the dreams go into deemed factually incorrect territory that we label it a 'hallucination'. It looks like a bug, but it's just the LLM doing what it always does."

He also draws a contrast with search engines: a search engine does not dream at all; it retrieves information that matches the input, with no hallucinations but also no ability to generate content. Should we then complain that search engines have a "lack of creativity" problem? (Karpathy did not actually pose this rhetorical question, but the implication is strong.)

Still, Karpathy's slightly exasperated short post leaves everyone an out: he distinguishes between "large language model assistants" (like ChatGPT) and "large language models" themselves, and acknowledges that the hallucinations people generally complain about concern the former. He also suggests several mitigations: Retrieval-Augmented Generation (RAG); comparing multiple responses generated by the model to find contradictions or inconsistencies; letting the model reflect on its own answer and run verification steps on the information it generates; and judging the correctness of specific outputs from the model's neural network activations. A rough sketch of the "compare multiple responses" idea follows below.
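
To make the second idea concrete, here is a minimal Python sketch of a self-consistency check: ask the model the same question several times and measure how often the answers agree. The ask_llm function is a hypothetical stand-in for whatever chat API you actually use, and the 0.6 threshold is an arbitrary illustration, not a value taken from Karpathy's post.

    from collections import Counter

    def ask_llm(prompt: str) -> str:
        """Hypothetical stand-in for a call to a chat model; replace with your provider's API."""
        raise NotImplementedError

    def self_consistency_check(prompt: str, n: int = 5) -> tuple[str, float]:
        """Ask the same question n times and return the most common answer
        together with the fraction of samples that agree with it."""
        answers = [ask_llm(prompt).strip().lower() for _ in range(n)]
        best_answer, count = Counter(answers).most_common(1)[0]
        return best_answer, count / n

    # Assumed usage: low agreement across samples suggests the model may be
    # "dreaming" rather than recalling something it actually knows.
    # answer, agreement = self_consistency_check("Who took the first photo of an exoplanet?")
    # if agreement < 0.6:
    #     print("Low agreement across samples; treat this answer with suspicion.")

The point of the sketch is only that disagreement between independent samples is a cheap warning sign; it does not prove an answer is correct, and a model can be consistently wrong.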

From Chomsky to Karpathy, we can see that the former criticizes from the position of a linguist, while the latter responds and defends from the practical, operational level. Chomsky shows us the true essence and spirit of human language and thinking (such as counterfactual thinking and moral principles), while Karpathy helps the general public understand the true nature of large language models: dream recollection corresponds to the way they operate, and our own hazy memories correspond to their training data sets.

When we understand that AI hallucinations are actually a normal part of an AI's routine, maybe we should instead ask: if AI is still dreaming, purely generating responses that are sometimes accurate based on its pool of vague data, what will happen when it wakes up? Will it be the moment of convergence? Is that when a 'superintelligence' will reign over humanity? Will that be the birth of a consciousness beyond humanity?

As for AI hallucinations: no matter the era, distinguishing truth from falsehood still depends on our own ability to discern, even when we seem to be in control of what is real. As the Cambridge Dictionary's publishing manager, Wendalyn Nichols, said: "The fact that AIs can 'hallucinate' reminds us that humans still need to bring their critical thinking skills to the use of these tools."