Oct 2023
3h 7m

Paul Christiano - Preventing an AI Takeover

Dwarkesh Patel
About this episode

Paul Christiano is the world’s leading AI safety researcher. My full episode with him is out!

We discuss:

- Does he regret inventing RLHF, and is alignment necessarily dual-use?

- Why he has relatively modest timelines (40% by 2040, 15% by 2030)

- What we want the post-AGI world to look like (do we want to keep gods enslaved forever?)

- Why he’s leading the push to get labs to develop responsible scaling policies, and what it would take to prevent an AI coup or bioweapon

- His current research into a new proof system, and how this could solve alignment by explaining a model’s behavior

- and much more.

Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Read the full transcript here. Follow me on Twitter for updates on future episodes.

Open Philanthropy

Open Philanthropy is currently hiring for twenty-two different roles to reduce catastrophic risks from fast-moving advances in AI and biotechnology, including grantmaking, research, and operations.

For more information and to apply, please see the application: https://www.openphilanthropy.org/research/new-roles-on-our-gcr-team/

The deadline to apply is November 9th; make sure to check out those roles before they close.

Timestamps

(00:00:00) - What do we want post-AGI world to look like?

(00:24:25) - Timelines

(00:45:28) - Evolution vs gradient descent

(00:54:53) - Misalignment and takeover

(01:17:23) - Is alignment dual-use?

(01:31:38) - Responsible scaling policies

(01:58:25) - Paul’s alignment research

(02:35:01) - Will this revolutionize theoretical CS and math?

(02:46:11) - How Paul invented RLHF

(02:55:10) - Disagreements with Carl Shulman

(03:01:53) - Long TSMC but not NVIDIA



This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit www.dwarkeshpatel.com