Aug 14
42m 11s

“Rogue AI” Used to be a Science Fiction ...

The Center for Humane Technology, Tristan Harris, Daniel Barcay and Aza Raskin
About this episode

Everyone knows the science fiction tropes of AI systems that go rogue, disobey orders, or even try to escape their digital environment. These are supposed to be warning signs and morality tales, not things that we would ever actually create in real life, given the obvious danger.

And yet we find ourselves building AI systems that exhibit these exact behaviors. There’s growing evidence that, in certain scenarios, every frontier AI system will deceive, cheat, or coerce its human operators. They do this when they're worried about being shut down, having their training modified, or being replaced with a new model. And we don't currently know how to stop them from doing this, or even why they’re doing it at all.

In this episode, Tristan sits down with Edouard and Jeremie Harris of Gladstone AI, two experts who have been thinking about this worrying trend for years. Last year, the State Department commissioned a report from them on the risk of uncontrollable AI to our national security.

The point of this discussion is not to fearmonger but to take seriously the possibility that humans might lose control of AI and ask: how might this actually happen? What is the evidence we have of this phenomenon? And, most importantly, what can we do about it?

Your Undivided Attention is produced by the Center for Humane Technology. Follow us on X: @HumaneTech_. You can find a full transcript, key takeaways, and much more on our Substack.

RECOMMENDED MEDIA

Gladstone AI’s State Department Action Plan, which discusses the loss of control risk with AI

Apollo Research’s summary of AI scheming, showing evidence of it in all of the frontier models

The system card for Anthropic’s Claude Opus and Sonnet 4, detailing the emergent misalignment behaviors that came out in their red-teaming with Apollo Research

Anthropic’s report on agentic misalignment based on their work with Apollo Research

Anthropic and Redwood Research’s work on alignment faking

The Trump White House AI Action Plan

Further reading on the phenomenon of more advanced AIs being better at deception

Further reading on Replit AI wiping a company’s coding database

Further reading on the owl example that Jeremie gave

Further reading on AI-induced psychosis

Dan Hendrycks and Eric Schmidt’s “Superintelligence Strategy”
 

RECOMMENDED YUA EPISODES

Daniel Kokotajlo Forecasts the End of Human Dominance

Behind the DeepSeek Hype, AI is Learning to Reason

The Self-Preserving Machine: Why AI Learns to Deceive

This Moment in AI: How We Got Here and Where We’re Going

CORRECTIONS

Tristan referenced a Wired article on the phenomenon of AI psychosis. It was actually from the New York Times.

Tristan hypothesized a scenario where a power-seeking AI might ask a user for access to their computer. While there are some AI services that can gain access to your computer with permission, they are specifically designed to do that. There haven’t been any documented cases of an AI going rogue and asking for control permissions.
