Feb 2025
1h 6m

863: TabPFN: Deep Learning for Tabular D...

Jon Krohn
About this episode

Jon Krohn talks tabular data with Frank Hutter, Professor of Artificial Intelligence at Universität Freiburg in Germany. Despite the great strides that deep learning has made in analysing images, audio, and natural language, tabular data has remained a stubborn obstacle. In this episode, Frank Hutter details the path he has found around this obstacle, even with limited data, by using a ground-breaking transformer architecture. Named TabPFN, this approach vastly outperforms other architectures, as attested by a write-up of TabPFN’s capabilities in Nature. Frank talks about his work on version 2 of TabPFN, the architecture’s cross-industry applicability, and how TabPFN is able to return accurate results with synthetic data.


This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.


In this episode you will learn:

  • (05:57) All about the TabPFN architecture 
  • (21:27) Use cases for Bayesian inference
  • (35:07) On getting published in Nature
  • (44:03) How TabPFN handles time series data
  • (51:52) All about Prior Labs


Additional materials: www.superdatascience.com/863
