By Theresa White, Director, Marketing and Growth Enablement
AI in Plain English Campaign
Every student has been there—the teacher’s eyes lock on you. They ask a question that makes your stomach sink and heart rate quicken—because you really don’t know the answer. You could say “I don’t know (IDK),” but that won’t look good. Instead, with a slight stammer, you begin to make something up, hoping to fake it well enough to get off the hook.
That, in essence, is what a large language model (LLM) does when it "hallucinates." These powerful AI tools, trained on vast amounts of text data, are designed to generate human-like responses. However, just like that nervous student, they sometimes present fabricated information as fact instead of admitting “I don’t know.”
I’ve used generative AI and large language models for ideation, drafting content, and research, both personally and professionally. Depending on the query, the information returned can be surprising. Sometimes it sounds accurate, but attempts to find the source material come up empty. Other times, the returned data is what I’ll call “exceptionally creative.” These experiences piqued my curiosity about hallucinations, and I wanted to learn more.
So, I prompted some large language models to explain themselves. Of course, I followed up by reading recent research and related publications. I wanted to learn how to minimize the likelihood that I would be misled by hallucinations in the future.
Understanding the Roots of Hallucinations
First, I wanted to understand why LLMs hallucinate.
- Training Data Gaps: LLMs learn from massive datasets, but these datasets aren't perfect. They may contain biases, inaccuracies, or simply lack information on specific topics.
- Statistical Patterns vs. True Understanding: LLMs excel at recognizing patterns in language, but they don't possess genuine understanding or reasoning abilities. They might string together words and phrases that sound plausible even if they're factually incorrect.
- Response Priority Bias: LLMs are designed to generate responses. When faced with a question, they prioritize providing an answer, even if they have to invent details to fill the gaps. Thus, it’s hard for you, as the consumer, to discern fact from fiction.
How Big Is the Problem?
So, how prevalent is the hallucination issue? I asked a few of the commercially available LLMs exactly that question. Unsurprisingly, they couldn’t give me a direct numerical answer.
However, in early March 2025, research firm AIMultiple published a benchmark study measuring the hallucination rates of nine different LLMs. After posing 60 questions to each model, the researchers found hallucination rates ranging from 15% at the lowest to 60% at the highest. These numbers alone gave me pause.
Statistics are one thing, but impact is another. Consider this real-world example:
In its report on the state of AI in the legal field, Stanford University described a case involving an attorney who used a commercial LLM to generate a legal brief. The LLM hallucinated several legal cases, providing citations for nonexistent court decisions. As a result, the attorney faced both sanctions and reputational damage.
Fortunately, there are actions we as LLM users can take to mitigate the risk associated with hallucinations.
Strategies for Reducing Hallucinations
While we can't eliminate hallucinations entirely, we can adopt strategies to mitigate their impact:
- Treat LLM responses with healthy skepticism. Don't assume everything they say is true.
- Be specific and clear in your prompts. The more context you provide, the better the LLM can understand your request.
- Ask the LLM to provide sources or justifications for its claims. Use phrases like "Are you sure?" or "Can you double-check?" to encourage the LLM to verify its information.
- Explicitly ask the LLM to state when it does not know. Prompt it with something like, "If you do not know the answer, please state that you do not know." (A rough example of this kind of prompt follows this list.)
- If you receive a suspicious response, rephrase your prompt or ask follow-up questions to clarify the information.
- Use multiple LLMs to compare responses and identify potential discrepancies.
- Use LLM output as a starting point, not a final product. Don't rely solely on LLMs for critical decision-making. Use your own judgement.
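To make a couple of these tips concrete, here is a minimal sketch of what a “hallucination-aware” prompt might look like if you happen to call an LLM from code. The question is made up, and the send_to_llm() helper is a hypothetical placeholder; substitute whatever model, client, or chat window you actually use.

```python
# A minimal sketch of a "hallucination-aware" prompt, assuming you are
# calling an LLM programmatically. The send_to_llm() helper is hypothetical;
# substitute whatever client or chat window you actually use.

QUESTION = "Which court decisions established the precedent for X?"  # made-up example

prompt = (
    "Answer the question below.\n"
    "Rules:\n"
    "1. Cite a verifiable source for every factual claim.\n"
    "2. If you do not know the answer, say 'I don't know' instead of guessing.\n"
    "3. Flag any part of your answer you are uncertain about.\n\n"
    f"Question: {QUESTION}"
)

print(prompt)
# response = send_to_llm(prompt)  # hypothetical call to your LLM of choice
```

Nothing about this exact wording is magic; the point is simply to give the model explicit permission to say “I don’t know” and to ask it to show its sources.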
Will LLM Accuracy Improve?
Experts are working to improve LLM reliability through fact-checking and grounding techniques such as Full-Context Retrieval and Verification (FCRV) and Retrieval-Augmented Generation (RAG). As LLMs continue to evolve, expect significant progress in reducing hallucinations. Until then, however, we as users must remain vigilant.
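For readers who like to peek under the hood, here is a toy sketch of the RAG idea, not a production implementation: retrieve relevant passages first, then instruct the model to answer only from what was retrieved. The tiny document store, the keyword-overlap scoring, and the answer_with_llm() call are all simplified assumptions for illustration.

```python
# A toy sketch of the Retrieval-Augmented Generation (RAG) idea:
# retrieve relevant passages first, then force the model to answer
# only from what was retrieved. Everything here is deliberately simplified.

documents = {
    "policy.txt": "Refund requests must be filed within 30 days of purchase.",
    "faq.txt": "Support hours are 9am to 5pm Eastern, Monday through Friday.",
}

def retrieve(question: str, top_k: int = 1) -> list[str]:
    """Rank passages by naive word overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        documents.values(),
        key=lambda text: len(q_words & set(text.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

question = "When are support hours?"
context = "\n".join(retrieve(question))

prompt = (
    "Answer using ONLY the context below. "
    "If the context does not contain the answer, reply 'I don't know.'\n\n"
    f"Context:\n{context}\n\nQuestion: {question}"
)

print(prompt)
# answer = answer_with_llm(prompt)  # hypothetical call to your LLM of choice
```

Real RAG systems use far more sophisticated search over large document collections, but the grounding principle is the same: the model answers from retrieved evidence rather than from memory alone.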
So, LLMs, my message to you: I’m the human with the reasoning power in this relationship. It’s A-OK if your answer is IDK.