Oct 2023
3h 7m

Paul Christiano - Preventing an AI Takeo...

Dwarkesh Patel
About this episode

Paul Christiano is the world’s leading AI safety researcher. My full episode with him is out!

We discuss:

- Does he regret inventing RLHF, and is alignment necessarily dual-use?

- Why he has relatively modest timelines (40% by 2040, 15% by 2030)

- What we want the post-AGI world to look like (do we want to keep gods enslaved forever?)

- Why he’s leading the push to get labs to develop responsible scaling policies, and what it would take to prevent an AI coup or bioweapon

- His current research into a new proof system, and how this could solve alignment by explaining models’ behavior

- and much more.

Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Read the full transcript here. Follow me on Twitter for updates on future episodes.

Open Philanthropy

Open Philanthropy is currently hiring for twenty-two different roles to reduce catastrophic risks from fast-moving advances in AI and biotechnology, including grantmaking, research, and operations.

For more information and to apply, please see the application: https://www.openphilanthropy.org/research/new-roles-on-our-gcr-team/

The deadline to apply is November 9th, so make sure to check out those roles before they close.

Timestamps

(00:00:00) - What do we want the post-AGI world to look like?

(00:24:25) - Timelines

(00:45:28) - Evolution vs gradient descent

(00:54:53) - Misalignment and takeover

(01:17:23) - Is alignment dual-use?

(01:31:38) - Responsible scaling policies

(01:58:25) - Paul’s alignment research

(02:35:01) - Will this revolutionize theoretical CS and math?

(02:46:11) - How Paul invented RLHF

(02:55:10) - Disagreements with Carl Shulman

(03:01:53) - Long TSMC but not NVIDIA



This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit www.dwarkeshpatel.com