Jun 2023
3h 9m

#154 - Rohin Shah on DeepMind and trying...

Rob, Luisa, and the 80,000 Hours team
About this episode

Can there be a more exciting and strange place to work today than a leading AI lab? Your CEO has said they're worried your research could cause human extinction. The government is setting up meetings to discuss how this outcome can be avoided. Some of your colleagues think this is all overblown; others are more anxious still.

Today's guest — machine learning researcher Rohin Shah — goes into the Google DeepMind offices each day with that peculiar backdrop to his work.

Links to learn more, summary and full transcript.

He's on the team dedicated to 'technical AI safety' as these models approach and exceed human capabilities: basically, ensuring the models help humanity accomplish its goals without flipping out in some dangerous way. This work has never seemed more important.

In the short term it could be the key bottleneck to deploying ML models in high-stakes real-life situations. In the long term, it could be the difference between humanity thriving and disappearing entirely.

For years Rohin has been on a mission to fairly hear out people across the full spectrum of opinion about risks from artificial intelligence — from doomers to doubters — and properly understand their points of view. That makes him unusually well placed to give an overview of what we do and don't understand. He has landed somewhere in the middle — troubled by ways things could go wrong, but not convinced there are very strong reasons to expect a terrible outcome.

Today's conversation is wide-ranging and Rohin lays out many of his personal opinions to host Rob Wiblin, including:

  • What he sees as the strongest case both for and against slowing down the rate of progress in AI research.
  • Why he disagrees with most other ML researchers that training a model on a sensible 'reward function' is enough to get a good outcome.
  • Why he disagrees with many on LessWrong that the bar for whether a safety technique is helpful is “could this contain a superintelligence?”
  • That he thinks nobody has very compelling arguments that AI created via machine learning will be dangerous by default, or that it will be safe by default. He believes we just don't know.
  • That he understands that analogies and visualisations are necessary for public communication, but is sceptical that they really help us understand what's going on with ML models, because they're different in important ways from every other case we might compare them to.
  • Why he's optimistic about DeepMind’s work on scalable oversight, mechanistic interpretability, and dangerous capabilities evaluations, and what each of those projects involves.
  • Why he isn't inherently worried about a future where we're surrounded by beings far more capable than us, so long as they share our goals to a reasonable degree.
  • Why it's not enough for humanity to know how to align AI models — it's essential that management at AI labs correctly pick which methods they're going to use and have the practical know-how to apply them properly.
  • Three observations that make him a little more optimistic: humans are a bit muddle-headed and not super goal-orientated; planes don't crash; and universities have specific majors in particular subjects.
  • Plenty more besides.

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris

Audio mastering: Milo McGuire, Dominic Armstrong, and Ben Cordell

Transcriptions: Katy Moore

Up next
Jul 8
#220 – Ryan Greenblatt on the 4 most likely ways for AI to take over, and the case for and against AGI in <8 years
Ryan Greenblatt — lead author on the explosive paper “Alignment faking in large language models” and chief scientist at Redwood Research — thinks there’s a 25% chance that within four years, AI will be able to do everything needed to run an AI company, from writing code to design…
2h 50m
Jun 24
#219 – Toby Ord on graphs AI companies would prefer you didn't (fully) understand
The era of making AI smarter just by making it bigger is ending. But that doesn’t mean progress is slowing down — far from it. AI models continue to get much more powerful, just using very different methods, and those underlying technical changes force a big rethink of what comin…
2h 48m
Jun 12
#218 – Hugh White on why Trump is abandoning US hegemony – and that’s probably good
For decades, US allies have slept soundly under the protection of America’s overwhelming military might. Donald Trump — with his threats to ditch NATO, seize Greenland, and abandon Taiwan — seems hell-bent on shattering that comfort. But according to Hugh White — one of the world’s…
2h 48m
Recommended Episodes
May 2023
Spotlight: AI Myths and Misconceptions
A few episodes back, we presented Tristan Harris and Aza Raskin’s talk The AI Dilemma. People inside the companies that are building generative artificial intelligence came to us with their concerns about the rapid pace of deployment and the problems that are emerging as a result…
26m 48s
Mar 2023
#312 — The Trouble with AI
Sam Harris speaks with Stuart Russell and Gary Marcus about recent developments in artificial intelligence and the long-term risks of producing artificial general intelligence (AGI). They discuss the limitations of Deep Learning, the surprising power of narrow AI, ChatGPT, a poss…
1h 26m
Jan 2022
The race to build AI that benefits humanity | Sam Altman
Will innovation in artificial intelligence drastically improve our lives, or destroy humanity as we know it? From the unintended consequences we've suffered from platforms like Facebook and YouTube to the danger of creating technology we can't control, it's easy to see why people…
1h 8m
Feb 2016
Paradigms in Artificial Intelligence
Artificial intelligence includes a number of different strategies for how to make machines more intelligent, and often more human-like, in their ability to learn and solve problems. An ambitious group of researchers is working right now to classify all the approaches to AI, perha…
17m 20s
Jan 2023
24. Artificial Intelligence: What Is It? What Is It Not? (feat. Susan Farrell, Principal UX Researcher at mmhmm.app)
The term artificial intelligence, AI, is having a bit of a boom, with the explosion in popularity of tools like ChatGPT, Lensa, DALL•E 2, and many others. The praises of AI have been equally met with skepticism and criticism, with cautionary tales about AI information quality, pl…
35m 35s
Aug 2023
How A.I. Will Destroy Or Save Humanity w/ Mo Gawdat
The former Chief Business Officer of GOOGLE X has a WARNING for us about AI and you NEED to hear it! THIS is the turning point for humanity… While many people are touting AI’s incredible benefits, others are striking a more cautionary tone about the future of AI, including this we…
1h 12m
Dec 2023
Michael Wooldridge on AI and sentient robots
Humans have a long-held fascination with the idea of Artificial Intelligence (AI) as a dystopian threat: from Mary Shelley's Frankenstein, through to the Terminator movies. But somehow, we still often think of this technology as 'futuristic', whereas in fact, it's already woven in…
37m 55s
Jan 2024
Why AI Should Be Taught to Know Its Limits
One of AI’s biggest, unsolved problems is what advanced algorithms should do when they confront a situation they don’t have an answer for. For programs like ChatGPT, that could mean providing a confidently wrong answer, what’s often called a “hallucination”; for others, as w…
17m 43s
May 2023
#239: Will Artificial Intelligence Replace You Soon? AI Expert Fahed Bizzari Reveals Everything
It's time we address the elephant in the room. There's no doubt that artificial intelligence is at the forefront of every industry and around every corner. But the big question on everyone's mind is: "Will I be replaced by a robot soon?" In this episode, we discuss exactly that…
38m 49s