Oct 2024
1h 2m

825: Data Contracts: The Key to Data Qua...

Jon Krohn
About this episode

Data contracts are redefining data quality and governance, and Chad Sanderson, CEO of Gable.ai, joins host Jon Krohn to explain how they can transform your data strategy. He breaks down what data contracts are, how they shift data quality checks closer to production, and why they’re essential for reducing data debt. Chad also highlights how better alignment between data producers and consumers can elevate data reliability and tackle change-management challenges in modern organizations.


This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.


In this episode you will learn:

  • What data contracts are and how they define expectations for data quality [03:16]
  • What data contracts look like [09:09]
  • The common misconceptions about data quality when implementing AI [12:55]
  • Chad’s Chief Operator role at Data Quality Camp [19:46]
  • How “shifting left” improves data reliability by addressing issues early [24:17]
  • Why data professionals still struggle with data quality [30:31]
  • How data debt forms and why it leads to complex, inefficient architectures [35:53]
  • How the role of human oversight in ensuring data quality will evolve [47:12]
  • How data teams can leverage storytelling [52:33]


Additional materials: www.superdatascience.com/825
