Sep 26
1h 6m

Richard Sutton – Father of RL thinks LLM...

Dwarkesh Patel
About this episode

Richard Sutton is the father of reinforcement learning, winner of the 2024 Turing Award, and author of The Bitter Lesson. And he thinks LLMs are a dead end.

After interviewing him, my steel man of Richard’s position is this: LLMs aren’t capable of learning on-the-job, so no matter how much we scale, we’ll need some new architecture to enable continual learning.

And once we have it, we won’t need a special training phase — the agent will just learn on-the-fly, like all humans, and indeed, like all animals.

This new paradigm will render our current approach with LLMs obsolete.

In our interview, I did my best to represent the view that LLMs might function as the foundation on which experiential learning can happen… Some sparks flew.

A big thanks to the Alberta Machine Intelligence Institute for inviting me up to Edmonton and for letting me use their studio and equipment.

Enjoy!

Watch on YouTube; listen on Apple Podcasts or Spotify.

Sponsors

* Labelbox makes it possible to train AI agents in hyperrealistic RL environments. With an experienced team of applied researchers and a massive network of subject-matter experts, Labelbox ensures your training reflects important, real-world nuance. Turn your demo projects into working systems at labelbox.com/dwarkesh

* Gemini Deep Research is designed for thorough exploration of hard topics. For this episode, it helped me trace reinforcement learning from early policy gradients up to current-day methods, combining clear explanations with curated examples. Try it out yourself at gemini.google.com

* Hudson River Trading doesn’t silo their teams. Instead, HRT researchers openly trade ideas and share strategy code in a mono-repo. This means you’re able to learn at incredible speed and your contributions have impact across the entire firm. Find open roles at hudsonrivertrading.com/dwarkesh

Timestamps

(00:00:00) – Are LLMs a dead end?

(00:13:04) – Do humans do imitation learning?

(00:23:10) – The Era of Experience

(00:33:39) – Current architectures generalize poorly out of distribution

(00:41:29) – Surprises in the AI field

(00:46:41) – Will The Bitter Lesson still apply post-AGI?

(00:53:48) – Succession to AIs



Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe