Oct 2023
3h 7m

Paul Christiano — Preventing an AI Takeover

Dwarkesh Patel
About this episode

Paul Christiano is the world’s leading AI safety researcher. My full episode with him is out!

We discuss:

- Does he regret inventing RLHF, and is alignment necessarily dual-use?

- Why he has relatively modest timelines (40% by 2040, 15% by 2030)

- What do we want the post-AGI world to look like (do we want to keep gods enslaved forever)?

- Why he’s leading the push to get labs to develop responsible scaling policies, and what it would take to prevent an AI coup or bioweapon

- His current research into a new proof system, and how this could solve alignment by explaining a model's behavior

- and much more.

Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Read the full transcript here. Follow me on Twitter for updates on future episodes.

Open Philanthropy

Open Philanthropy is currently hiring for twenty-two different roles to reduce catastrophic risks from fast-moving advances in AI and biotechnology, including grantmaking, research, and operations.

For more information and to apply, please see the application: https://www.openphilanthropy.org/research/new-roles-on-our-gcr-team/

The deadline to apply is November 9th; make sure to check out those roles before they close.

Timestamps

(00:00:00) - What do we want the post-AGI world to look like?

(00:24:25) - Timelines

(00:45:28) - Evolution vs gradient descent

(00:54:53) - Misalignment and takeover

(01:17:23) - Is alignment dual-use?

(01:31:38) - Responsible scaling policies

(01:58:25) - Paul’s alignment research

(02:35:01) - Will this revolutionize theoretical CS and math?

(02:46:11) - How Paul invented RLHF

(02:55:10) - Disagreements with Carl Shulman

(03:01:53) - Long TSMC but not NVIDIA



Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe