Feb 2025
52m 30s

π0: A Foundation Model for Robotics with...

Sam Charrington
About this episode

Today, we're joined by Sergey Levine, associate professor at UC Berkeley and co-founder of Physical Intelligence, to discuss π0 (pi-zero), a general-purpose robotic foundation model. We dig into the model architecture, which pairs a vision language model (VLM) with a diffusion-based action expert, and the model training "recipe," emphasizing the roles of pre-training and post-training with a diverse mixture of real-world data to ensure robust and intelligent robot learning. We review the data collection approach, which uses human operators and teleoperation rigs, the potential of synthetic data and reinforcement learning in enhancing robotic capabilities, and much more. We also introduce the team’s new FAST tokenizer, which opens the door to a fully Transformer-based model and significant improvements in learning and generalization. Finally, we cover the open-sourcing of π0 and future directions for their research.


The complete show notes for this episode can be found at https://twimlai.com/go/719.
