Anthropic just released Claude Opus 4.8, and the headline improvement is unusual: the model is built to flag its own uncertainty and say "I'm not sure." Anthropic says it's roughly four times less likely to let a flaw pass without catching it. When a company's flagship upgrade is honesty, that tells you something about where we are.
Here is the other side of it. Harrison asked Google's Gemini one simple factual question for an article he was writing: did Jeff Dunham use AI to create the opening visuals for his 2024 comedy special? Gemini said yes, confidently, and cited a source. When Harrison pushed on that source, the tool did not check itself. It invented a new one. Then another. By the end it had manufactured four separate references, including a word-for-word on-screen quote that does not exist, before finally admitting the only real source was a single unsourced blog post.
This episode walks the whole chain step by step. You will learn:
- The exact failure mode: when an AI hits a popular but unverified claim, it gets confident instead of careful, and every round of pushback produces a fresh citation instead of a fresh doubt.
- Why the Vectara Hallucination Leaderboard shows that roughly one in ten outputs is wrong, even on a task as simple as summarizing a document.
- A five-step, 30-minute verification process you can run on almost any claim before you repeat it.
- Where source verification sits in The 7 Levels of AI Proficiency (it defines Level 3, the Critical Thinker) and why that is the level every working professional should be reaching for in 2026.
- Three things to do this week to protect your own credibility.
This is not an anti-AI episode. Harrison uses these tools every day. It is about the difference between trusting a tool blindly and trusting it after you have checked. That second posture is what separates an amateur from a professional whose name is on the line.
Want to know where you stand? The 7 Levels of AI Proficiency assessment is free and takes 10 minutes: assess.launchready.ai
Harrison Painter
Executive AI Advisor
LaunchReady.ai
Further. Faster.