Feb 2025
1h 6m

863: TabPFN: Deep Learning for Tabular D...

Jon Krohn
About this episode

Jon Krohn talks tabular data with Frank Hutter, Professor of Artificial Intelligence at Universität Freiburg in Germany. Despite the great strides that deep learning has made in analysing images, audio, and natural language, tabular data has remained a seemingly insurmountable obstacle for it. In this episode, Frank Hutter details the path he has found around this obstacle, even with limited data, by using a ground-breaking transformer architecture. Named TabPFN, this approach vastly outperforms other architectures, as attested by a write-up of TabPFN’s capabilities in Nature. Frank talks about his work on version 2 of TabPFN, the architecture’s cross-industry applicability, and how TabPFN is able to return accurate results with synthetic data.


This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.


In this episode you will learn:

  • (05:57) All about the TabPFN architecture 
  • (21:27) Use cases for Bayesian inference
  • (35:07) On getting published in Nature
  • (44:03) How TabPFN handles time series data
  • (51:52) All about Prior Labs


Additional materials: www.superdatascience.com/863
