AI Advisory · June 12, 2026 · By BetterFutureLabs

Hallucinated by AI. Published by KPMG.

KPMG withdrew its flagship agentic AI report after an audit found that only 5 of its 45 citations held up - fabricated case studies included. The AI didn't fail; the operating model did. What that means for how enterprises adopt AI, and who they trust to guide it.

The Short Version

In October 2025, KPMG International published a flagship report titled "Total Experience: Redefining Excellence in the Age of Agentic AI." It presented named case studies of organizations like UBS, Transport for London, Swiss Federal Railways, and the NHS already running AI agents in production.

This week, the firm pulled the report from its websites after the AI-detection company GPTZero [audited every citation in it](https://gptzero.me/news/investigations-kpmg/) and the [Financial Times](https://www.ft.com/content/b3828e92-4961-4b39-84f0-c42f33be3c3f) verified the findings with the organizations named.

Anne Applebaum sharing the FT's reporting on X · June 12, 2026

The story traveled far beyond the trade press. Pulitzer Prize-winning journalist Anne Applebaum's [post on X](https://x.com/anneapplebaum/status/2065378909584060916) passed 68,000 views on the day it went up: "Amazing: KPMG wrote a report describing the successful use of AI by businesses. But the case studies turned out to be AI hallucinations."

What the audit found:

Only 5 of the report's 45 citations pointed accurately to real, uncorrupted sources. GPTZero classified 40 of the 45 citation titles as fabricated.
Roughly half of the claims those citations were meant to support appear to be fake or misattributed.
The flagship case studies collapsed on contact. UBS told the Financial Times that the claim it runs AI agents for investment advice, risk, and compliance on a Microsoft-built platform is "factually incorrect." Swiss Federal Railways called its case study "not accurate." Transport for London: "misleading." NHS Greater Manchester said the claims don't match the press release the report cites.

The smaller details tell the same story. A 2019 Japanese rail press release was cited as evidence of deployed agentic AI, years before agentic AI existed as a category. Emirates was credited with a booking chatbot; the system in question is a physical airport robot that cannot change a booking. The report even contradicted KPMG's own published research, citing a CEO statistic at 55% that the firm's own 2025 CEO Outlook puts at 71%.

KPMG told reporters it takes "the accuracy and integrity of its published content seriously" and removed the report while it investigates how it was published.

The AI Did Not Fail. The Operating Model Did.

GPTZero's working theory matches what anyone who operates these systems daily would suspect: an AI research tool was asked to find real-world examples of agentic AI, over-complied, and invented them. The output then went to print under one of the most trusted brands in professional services. As GPTZero's investigators put it: "We suspect no human at KPMG double-checked the citations."

Here's the thing. Hallucination under pressure to produce examples is the single most predictable failure mode in applied AI. It is not a surprise, and it is not bad luck. Teams that build with these systems engineer for it the way structural engineers design for load:

Grounding: AI-generated claims must trace back to retrieved sources, automatically.
Verification gates: outputs pass through evals and human review before anything ships.
Ownership: a named person is accountable for every artifact, AI-assisted or not.
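The three practices above can be sketched as a single pre-publication check. This is a minimal illustration, not a description of any firm's actual pipeline: the `Claim` fields, the `gate` function, and the example URLs are all hypothetical, and the exact-substring match stands in for whatever grounding check a real team would use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    """One publishable claim. All field names are hypothetical."""
    text: str                          # sentence that would appear in the report
    source_url: str                    # where the claim is said to come from
    quote: str                         # exact passage in the source that supports it
    owner: str                         # named person accountable for the claim
    reviewed_by: Optional[str] = None  # human who checked it, if anyone


def gate(claims: list[Claim], retrieved: dict[str, str]) -> list[str]:
    """Return blocking problems; an empty list means the draft may ship."""
    problems = []
    for i, c in enumerate(claims, start=1):
        # Grounding: the source must have been retrieved, and the
        # supporting quote must literally appear in its text.
        source_text = retrieved.get(c.source_url)
        if source_text is None:
            problems.append(f"claim {i}: source was never retrieved")
        elif not c.quote.strip() or c.quote not in source_text:
            problems.append(f"claim {i}: quote not found in source")
        # Ownership: someone's name is on it.
        if not c.owner.strip():
            problems.append(f"claim {i}: no named owner")
        # Verification gate: a human has signed off.
        if not c.reviewed_by:
            problems.append(f"claim {i}: no human review")
    return problems


# Illustrative data only; URLs and names are placeholders.
retrieved = {
    "https://example.com/press-release":
        "The pilot robot greets passengers at the airport.",
}
claims = [
    Claim("Airline X runs a booking chatbot.",
          "https://example.com/press-release",
          "booking chatbot", "A. Editor", "R. Reviewer"),
    Claim("Airline X pilots an airport robot.",
          "https://example.com/press-release",
          "robot greets passengers", "A. Editor", "R. Reviewer"),
    Claim("Bank Y runs agents in production.",
          "https://example.com/missing",
          "agents in production", "", None),
]
problems = gate(claims, retrieved)
for p in problems:
    print(p)
```

In this sketch the first claim is blocked because its supporting quote is not in the retrieved source, the second passes, and the third fails all three checks. The point is the shape of the control, not the specific matching rule: nothing reaches print unless every claim clears grounding, review, and ownership.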

None of that existed between an AI research tool and a global firm's flagship publication. That is not a technology gap. It is an operating-model gap - the exact thing enterprises are told they need help closing.

A Pattern, Not an Incident

KPMG has company. Deloitte refunded part of an AU$440,000 Australian government engagement in 2025 after fabricated material was found in a delivered report. EY pulled a study this spring after researchers flagged fake footnotes. Elite law firms have apologized to courts for AI-invented citations in filings.

These are the same firms selling AI transformation and AI governance, and their delivery model is why this keeps happening. The big-firm engine is leverage: a few experts directing armies of generalists producing enormous volumes of documents. Add generative AI to that machine and output velocity multiplies while verification capacity stays flat. The math does the rest.
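To make that math concrete, here is a toy model. The numbers are purely illustrative assumptions, not figures from any firm: a team that could review 10 documents a week and used to produce 10, with AI multiplying output while review capacity stays fixed. The function name `unverified_share` is hypothetical.

```python
def unverified_share(drafts_per_week: float, reviews_per_week: float) -> float:
    """Fraction of weekly output that ships unreviewed, if everything ships."""
    if drafts_per_week <= 0:
        return 0.0
    return max(0.0, 1.0 - reviews_per_week / drafts_per_week)


BASELINE_DRAFTS = 10   # hypothetical pre-AI weekly output
REVIEW_CAPACITY = 10   # hypothetical weekly review capacity, held flat

for multiplier in (1, 2, 5, 10):
    drafts = BASELINE_DRAFTS * multiplier
    share = unverified_share(drafts, REVIEW_CAPACITY)
    print(f"{multiplier:>2}x output: {share:.0%} ships unverified")
```

Under these assumptions, doubling output leaves half of it unchecked, and a tenfold increase leaves 90% unchecked. The only ways out are to scale verification with output or to stop shipping what has not been verified.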

The Takeaway Is Not "Slow Down on AI"

Reading this as "AI isn't ready" would be exactly the wrong lesson. We are an AI venture studio. AI agents write production code, run research, and operate real workflows inside our portfolio companies every single day. The technology is ready, and the upside is real.

The lesson is that AI output is a draft until something verifies it - a system, an expert, ideally both. The difference between AI as leverage and AI as liability is not which model you license. It is whether the people deploying it have shipped AI in production, know its failure modes from experience, and build verification into the workflow instead of bolting AI onto an unverified content pipeline.

That operating expertise is what most organizations are missing, and it is what they should demand from anyone advising them.

What to Ask Anyone Advising You on AI

Their own shop is the audition. Before you take AI guidance from any firm - including us - make them answer in specifics:

What have you shipped? Show me AI systems running in production, and the verification layer around them.
What is your own AI operating model? How does AI-assisted work get checked before it carries your name?
Who is accountable? A named human owner for AI output quality on this engagement.
What has gone wrong for you, and what did you change? Real practitioners have hit failure modes and re-engineered their systems in response. Slide-sellers haven't, because their AI work hasn't touched reality yet.

A firm that can answer those questions in specifics has earned the conversation.

Where We Stand

BetterFutureLabs is a builder-first studio. We build and own AI companies, operate agentic systems in production every day, and bring that same operating capability into every [advisory engagement](/advisory): senior builders, shipped code, and verification designed in from day one.

The takeaway is simple: pick an AI partner who already operates AI the way you would want it operated inside your business - experts in the loop, verification built in, accountability by name. KPMG just showed everyone what the alternative looks like.

Sources

[Financial Times reporting on the KPMG report (June 2026)](https://www.ft.com/content/b3828e92-4961-4b39-84f0-c42f33be3c3f)
[GPTZero investigation: "Chasing the Hallucinations"](https://gptzero.me/news/investigations-kpmg/)
[City AM: "KPMG report on AI found riddled with AI hallucinations"](https://www.cityam.com/kpmg-report-on-ai-found-riddled-with-ai-hallucinations/)
[SWI swissinfo.ch: "KPMG report contained AI hallucinations on benefits of AI"](https://www.swissinfo.ch/eng/swiss-ai/kpmg-report-contained-ai-hallucinations-on-benefits-of-ai/91574511)
