Hallucinating AI in Court: When the Machine Gives Bad Advice

The Machine Lies — and the Lawyer Signs

In June 2023, a case made headlines across the international legal press and shook the relationship between artificial intelligence and legal practice. In the New York case Mata v. Avianca, Inc., a lawyer had cited six court decisions in his brief that did not exist. ChatGPT had fabricated them, complete with case numbers, judges' names, and quotations. The court imposed a fine of USD 5,000 and required the lawyers to notify by letter the judges falsely identified as authors of the fabricated opinions.

What sounds like an isolated incident has long since become a pattern. Since 2023, nearly 600 cases have been documented in the United States in which lawyers submitted AI-generated false citations to courts — with sanctions ranging from USD 100 to over USD 31,000. Warning signs are also mounting in Germany. But what exactly are AI hallucinations, why are they so dangerous for legal practice, and what professional duties are at stake?

What Are AI Hallucinations?

The term hallucination in AI research describes a phenomenon where a language model (Large Language Model, LLM) generates information that sounds plausible but is factually incorrect. In the legal context, this manifests in three typical forms:

Fabricated Decisions: The model invents complete rulings — with fictitious case numbers, judges' names, and headnotes. This was the central problem in Mata v. Avianca.

False Quotations from Real Decisions: The model cites a decision that actually exists but misrepresents its content — for instance, attributing a legal opinion the court never actually held.

Correct Citations with False Reasoning: The model provides a correct citation but connects it with legal reasoning that does not follow from the cited decision.

The Scale of the Problem

A widely noted 2024 study by Stanford University examined the reliability of common language models on legal questions. The results were sobering: GPT-4 hallucinated in 58 percent of cases, GPT-3.5 in 69 percent, and Llama 2 in 88 percent. Specialized legal AI tools performed better but remained far from reliable in a 2025 follow-up study: Lexis+ AI produced incorrect information in over 17 percent of cases, Westlaw AI in approximately 33 percent.

These figures show that AI hallucinations are not a marginal problem but a structural risk in the use of AI in legal practice.

Spectacular Cases: From New York to Alabama

Mata v. Avianca (New York, 2023)

The original case involved a personal injury claim against the airline Avianca. Lawyers Peter LoDuca and Steven Schwartz of Levidow, Levidow & Oberman had used ChatGPT for legal research. When opposing counsel reported being unable to locate the cited decisions, the lawyers did not withdraw their submissions. Judge P. Kevin Castel described an "unprecedented circumstance" and imposed the sanction. The case became a worldwide warning signal.

Pennsylvania (2025)

Attorney Nicholas L. Palazzo was sanctioned after his briefs in a product liability case contained erroneous and apparently fabricated citations. The referenced decisions either did not exist or contained significant factual errors.

Alabama (2025)

Three attorneys from the large firm Butler Snow LLP (Matthew B. Reeves, William J. Cranford, and William R. Lunsford) received a public reprimand and were disqualified from the case. The matter was also referred to the Alabama State Bar because of the fabricated legal authorities.

Illinois (2025)

A Springfield attorney was penalized for citing eight non-existent decisions in appellate proceedings. In a separate case, attorneys for the Chicago Housing Authority cited the fictitious case "Mack v. Anderson" — the responsible attorney stated she did not believe ChatGPT was capable of generating false precedents.

Professional Consequences Under German Law

Fundamental Duties Under § 43 and § 43a BRAO

The German Federal Lawyers' Act (Bundesrechtsanwaltsordnung, BRAO) establishes clear requirements for attorney diligence. § 43 BRAO sets out the general professional duty: lawyers must practice their profession conscientiously (gewissenhaft). § 43a BRAO specifies further fundamental duties, including independence (Unabhängigkeit), confidentiality (Verschwiegenheit), and objectivity (Sachlichkeit).

What does this mean for AI use? Conscientiousness requires the lawyer to verify the accuracy of their work products — regardless of what tools they employ. Anyone who passes on AI-generated legal advice or briefs to clients or files them with courts without verification violates this fundamental duty.

Liability Under § 280 BGB

Beyond professional sanctions, civil liability looms. The attorney-client agreement establishes a duty of diligent legal advice. If a lawyer provides legal advice based on AI hallucinations that proves to be incorrect and causes damage to the client, they are liable for damages under § 280(1) BGB (Schadensersatz wegen Pflichtverletzung — damages for breach of duty).

The defence that one "relied on" the AI provides no relief. The duty of independent verification is non-delegable — neither to a legal trainee nor to an algorithm.

Disciplinary Measures Under § 113 BRAO

In cases of serious breaches of duty, disciplinary measures under § 113 BRAO also come into play. These range from a warning (Warnung) through fines (Geldbuße) to a ban on representation (Vertretungsverbot) and, in the most extreme cases, disbarment (Ausschließung) from the legal profession.

Why Do AI Models Hallucinate?

Understanding the technical causes is crucial for the responsible use of AI tools.

Statistical Word Prediction Rather Than Knowledge Representation

Large language models are fundamentally statistical text generators. Based on patterns learned from their training data, they calculate the probability of the next token (a word or word fragment). They do not "understand" legal content; they generate text that statistically resembles legal argumentation. Nor do they look anything up: when no suitable real decision is strongly represented in what the model has learned, it still produces one that sounds plausible.
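The mechanism can be illustrated with a deliberately tiny sketch. This is not how production LLMs are built (they use neural networks over very large vocabularies). It is a toy bigram model, assumed here purely for illustration, trained on three invented citation strings. The corpus entries and function names are hypothetical.

```python
import random
from collections import defaultdict

# Invented, citation-like training strings (hypothetical, not real cases).
CORPUS = [
    "Smith v. Jones , 123 F.3d 456",
    "Smith v. Brown , 789 F.3d 456",
    "Miller v. Jones , 123 F.2d 101",
]


def train_bigrams(lines):
    """Count which token follows which. These counts are all the model 'knows'."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in lines:
        tokens = ["<s>"] + line.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, rng, max_tokens=20):
    """Sample one token at a time, weighted by how often it followed the previous one."""
    token, out = "<s>", []
    for _ in range(max_tokens):
        followers = counts[token]
        choices = list(followers)
        weights = [followers[c] for c in choices]
        token = rng.choices(choices, weights=weights, k=1)[0]
        if token == "</s>":
            break
        out.append(token)
    return " ".join(out)


if __name__ == "__main__":
    model = train_bigrams(CORPUS)
    rng = random.Random()
    for _ in range(5):
        text = generate(model, rng)
        label = "seen in corpus" if text in CORPUS else "never seen: fabricated"
        print(f"{text}  [{label}]")
```

Every step of the generation is locally plausible, because each token did follow the previous one somewhere in the corpus. The assembled string, however, is often a combination of parties, volume, and reporter that appears nowhere in the training data. The model has no step at which it checks whether the result exists.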

Lack of Source Verification

Unlike legal databases such as juris, beck-online, or dejure.org, general language models have no access to verified case law databases. They cannot distinguish between existing and fabricated decisions. Retrieval-Augmented Generation (RAG) systems, as used by specialized legal tools, mitigate this problem — but do not eliminate it entirely, as the Stanford study demonstrates.
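A source check of the kind a general language model lacks can be sketched in a few lines. The example assumes a simplified pattern for US reporter citations and uses a plain Python set as a stand-in for a query against a verified case law database. The pattern, the function names, and all citations shown are hypothetical.

```python
import re

# Simplified pattern for citations such as "123 F.3d 456" (assumption: real
# citation formats are far more varied than this covers).
CITATION_RE = re.compile(
    r"\b\d{1,4} (?:U\.S\.|F\.[23]d|F\.4th|F\. Supp\. [23]d) \d{1,4}\b"
)


def extract_citations(text):
    """Return every citation-like string found in the draft."""
    return CITATION_RE.findall(text)


def unverified_citations(text, trusted_index):
    """Return the citations that the trusted index does not contain."""
    return [c for c in extract_citations(text) if c not in trusted_index]


if __name__ == "__main__":
    # Stand-in for a lookup in a verified database (hypothetical content).
    trusted = {"111 F.3d 222"}
    draft = "The claim is supported by 111 F.3d 222 and by 333 F.3d 444."
    for citation in unverified_citations(draft, trusted):
        print(f"Not found in trusted index, check manually: {citation}")
```

A check like this only catches fabricated decisions. A citation that passes still has to be read, because the model may have misquoted a real decision or attached reasoning to it that the decision does not support.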

Self-Confirmation and Confabulation

Particularly insidious: when a user asks an AI about the existence of a hallucinated decision, the model frequently confirms its existence and even provides further "details." This phenomenon, known as confabulation, makes it difficult to detect hallucinations through mere follow-up questioning.

Lessons for the German Legal Profession

1. AI Is a Research Tool, Not a Legal Source

The fundamental insight is that AI models are not reliable legal sources. They can provide starting points for research, highlight connections, and offer drafting assistance. But every piece of legal information must be verified against primary sources: statutes, case law, and commentaries.

2. Four-Eyes Principle for AI-Assisted Work

Establish a binding four-eyes principle (Vieraugenprinzip) in your firm for AI-assisted work products. No brief, no legal opinion, and no contract draft should leave the firm without a second person — an experienced lawyer — having verified the accuracy of the cited sources.

3. Documentation of AI Use

Document when and how AI tools are used. This serves not only internal quality assurance but can also facilitate demonstrating, in the event of a liability dispute, that appropriate review processes were established.
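What such documentation might record can be sketched as a simple data structure. The field names, the release rule, and the sample values below are assumptions for illustration, not a prescribed or legally required format.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class AIUsageRecord:
    """One entry per AI-assisted work product (all field names are hypothetical)."""
    matter_id: str
    tool: str
    purpose: str
    used_at: str                       # ISO 8601 timestamp
    reviewed_by: Optional[str] = None  # second reviewer, if any
    sources_verified: bool = False     # were citations checked against primary sources?


def to_log_line(record):
    """Serialize the record as one JSON line for an append-only log file."""
    return json.dumps(asdict(record), sort_keys=True)


def is_release_ready(record):
    """Assumed firm rule: a reviewer is named and the sources were verified."""
    return record.reviewed_by is not None and record.sources_verified


if __name__ == "__main__":
    record = AIUsageRecord(
        matter_id="M-001",
        tool="example-llm",
        purpose="first draft of brief",
        used_at="2025-01-15T09:30:00+00:00",
    )
    print(to_log_line(record))
    print("Release ready:", is_release_ready(record))
```

Keeping the reviewer and the verification status in the same record as the tool use makes it easier to show later that a review process existed and was followed for the specific document.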

4. Training and Awareness

Invest in the AI literacy of your staff. Every lawyer who uses AI must understand how the technology works and where its limits lie — particularly the hallucination phenomenon.

5. Transparency Towards Clients and Courts

Communicate openly when AI tools are used in client work. Several US courts already require disclosure of AI use in filings. It is only a matter of time before similar requirements are discussed in Germany.

The Regulatory Outlook

EU AI Act and Legal Practice

The EU AI Act has been in force since 1 August 2024, and its central provisions become applicable on 2 August 2026. AI systems deployed in the administration of justice (for example, for access to justice or the interpretation of statutory texts) may be classified as high-risk AI. This would entail strict requirements regarding transparency, documentation, and human oversight.

Professional Regulatory Developments

In the United States, the bar associations of New York, California, and Florida have already published guidelines on AI use. German bar associations will need to follow this example. Clear professional regulation of AI use in legal practice is overdue.

Conclusion: Trust Is Good, Verification Is Mandatory

AI hallucinations are not a software bug that will be fixed in the next version. They are a structural feature of generative language models. The Stanford studies demonstrate that even specialized legal AI tools hallucinate to a significant extent. Those who ignore this endanger their clients, their reputation, and their licence to practise.

The answer lies not in rejecting AI — it lies in its responsible use. AI can be a powerful tool for legal research, argument structuring, and efficiency gains. But it replaces neither legal expertise nor the attorney's duty of care. The lawyer remains the guarantor of accuracy — not the algorithm.

At compleneo, we support you in developing firm-internal guidelines for AI use, in training your teams, and in the legal assessment of AI tools for legal practice. Get in touch with us.
