Hallucination
What it is
Hallucination in language models is the generation of coherent but factually incorrect or nonsensical content. Models produce tokens that are likely under the learned distribution but don’t correspond to true facts. Hallucinations are particularly problematic for knowledge-intensive tasks (QA, summarization) where factuality is critical. Hallucination is a fundamental limitation, not a bug, arising from the probabilistic nature of language modeling.
[illustrate: Model generating plausible-sounding but false statement; contrasted with factually correct alternative; confidence scores showing model is equally confident in both]
How it works
- Root cause: Language models predict likely tokens, not true tokens (see the sketch after this list)
- P(token | context) ≠ P(token is true | context)
- Model learns statistical patterns, not world knowledge
- Factors increasing hallucination:
- Topics that are sparse or rare in the training data (the model has little reliable signal)
- Out-of-domain questions (no relevant training examples)
- Long generation (errors accumulate)
- Ambiguous or incomplete prompts
- Manifestations:
- Invented facts: “The capital of France is Rome” (confident but wrong)
- Made-up citations: “[Smith et al., 2025]” (plausible format, non-existent paper)
- Contradictions within the same response
- Confabulation of reasoning steps
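The gap between likely and true can be probed directly: a causal LM scores a continuation by P(tokens | context), and nothing in that score checks factuality. Below is a minimal sketch, assuming the Hugging Face transformers package with the small gpt2 checkpoint as a stand-in model; the prompt and candidate titles are illustrative inputs. It compares the probabilities the model assigns to competing completions, whether or not they are true.

```python
# Minimal sketch: assumes the `transformers` package and the small "gpt2"
# checkpoint, used only as a stand-in for any causal language model.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def continuation_prob(prompt: str, continuation: str) -> float:
    """Probability the model assigns to `continuation` given `prompt`.

    This is P(tokens | context) under the learned distribution; it says
    nothing about whether the continuation is factually true.
    """
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits          # [1, seq_len, vocab]
    log_probs = torch.log_softmax(logits, dim=-1)
    # Score only the continuation tokens, each conditioned on its prefix.
    total = 0.0
    for pos in range(prompt_ids.shape[1], full_ids.shape[1]):
        token_id = full_ids[0, pos]
        total += log_probs[0, pos - 1, token_id].item()
    return math.exp(total)

prompt = "The first novel Jane Austen published was"
for cont in [" Sense and Sensibility", " Emma", " Pride and Prejudice"]:
    print(cont, continuation_prob(prompt, cont))
# A small model may rank a wrong title highest: high probability != true.
```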
Example
# Hallucination example:
Q: "Which novel did Jane Austen publish in 1807?"
A: "Emma" (WRONG: Emma was published in 1815, and Austen published no novel in 1807)
(The model answers confidently despite being wrong)
# Grounded variant (with RAG):
Context: "Jane Austen's novels: Pride and Prejudice (1813), Emma (1815), ..."
Q: "Which novel did Jane Austen publish in 1807?"
A: "I don't find a Jane Austen novel published in 1807 in the provided context."
(Grounding prevents hallucination)
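The grounded variant can be sketched in code. The toy corpus, keyword retriever, and prompt template below are illustrative stand-ins rather than a specific RAG library; the point is only that the answer is constrained to the supplied context, so a question with no supporting evidence should yield an abstention instead of an invented title.

```python
# Minimal grounding sketch. The toy corpus, keyword retriever, and prompt
# template are illustrative assumptions, not a particular RAG framework.
CORPUS = [
    "Jane Austen's novels: Sense and Sensibility (1811), Pride and Prejudice (1813),",
    "Mansfield Park (1814), Emma (1815), Northanger Abbey and Persuasion (1817, posthumous).",
]

def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank passages by word overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(corpus, key=lambda p: -len(q_words & set(p.lower().split())))
    return scored[:k]

def grounded_prompt(question: str) -> str:
    """Build a prompt that restricts the model to the retrieved context."""
    context = "\n".join(retrieve(question, CORPUS))
    return (
        "Answer using ONLY the context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

print(grounded_prompt("Which novel did Jane Austen publish in 1807?"))
# The prompt is then sent to the model; with the instruction and context in
# place, the expected behavior is "I don't know" rather than an invented title.
```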
Variants and history
Hallucination is inherent to language models and was extensively documented with GPT-3 and GPT-4. Chain-of-thought prompting helps but does not eliminate it. Self-critique and Constitutional AI reduce (but do not eliminate) hallucinations. Retrieval augmentation (RAG) dramatically reduces hallucination by grounding generation in external knowledge. Smaller models hallucinate more; larger models hallucinate differently (more convincingly). Principled detection and mitigation remain open problems.
When to use it
Be aware of hallucination when:
- Deploying models for knowledge-intensive tasks
- Users expect factuality (QA, medical, legal advice)
- Combining model outputs with real-world actions
- Generating long content (error accumulation)
Mitigation strategies:
- Use RAG to ground outputs in documents
- Ask the model for sources and verify them (a simple citation check is sketched below)
- Use fact-checking models post-hoc
- Combine with expert review for critical applications
- Prefer narrow, closed-domain setups over open-ended generation; they are easier to keep factual
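One inexpensive post-hoc check, verifying that cited sources actually exist, can be automated. The sketch below uses a hypothetical trusted bibliography and a deliberately simple regex to flag bracketed citations that do not resolve, so fabricated references such as [Smith et al., 2025] get routed to human review.

```python
# Minimal post-hoc check: flag citations in a model answer that do not appear
# in a trusted bibliography. The regex and the bibliography entries are
# illustrative assumptions; real pipelines would resolve DOIs or database IDs.
import re

KNOWN_REFERENCES = {
    "Brown et al., 2020",   # entries pulled from a curated bibliography
    "Lewis et al., 2020",
}

def unverified_citations(answer: str) -> list[str]:
    """Return bracketed citations like "[Smith et al., 2025]" not in the bibliography."""
    cited = re.findall(r"\[([^\[\]]+?,\s*\d{4})\]", answer)
    return [c for c in cited if c not in KNOWN_REFERENCES]

answer = "RAG reduces hallucination [Lewis et al., 2020] [Smith et al., 2025]."
print(unverified_citations(answer))  # ['Smith et al., 2025'] -> flag for review
```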