Aug 2024
53m 30s

The Evolution of DataOps: Insights from ...

Tobias Macey
About this episode
Summary
In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Chris Bergh, CEO of DataKitchen, to discuss his ongoing mission to simplify the lives of data engineers. Chris explains the challenges faced by data engineers, such as constant system failures, the need for rapid changes, and high customer demands. He delves into the concept of DataOps, its evolution, and the misappropriation of related terms like data mesh and data observability. He emphasizes the importance of focusing on processes and systems rather than just tools to improve data engineering workflows. Chris also introduces DataKitchen's open-source tools, DataOps TestGen and DataOps Observability, designed to automate data quality validation and monitor data journeys in production.
Announcements
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. It is trusted by teams of all sizes, including Comcast and DoorDash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.
  • Your host is Tobias Macey and today I'm interviewing Chris Bergh about his tireless quest to simplify the lives of data engineers
Interview
  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what DataKitchen is and the story behind it?
  • You helped to define and popularize "DataOps", which then went through a journey of misappropriation similar to "DevOps", and has since faded in use. What is your view on the realities of "DataOps" today?
  • Out of the popularized wave of "DataOps" tools came subsequent trends in data observability, data reliability engineering, etc. How have those cycles influenced the way that you think about the work that you are doing at DataKitchen?
  • The data ecosystem went through a massive growth period over the past ~7 years, and we are now entering a cycle of consolidation. What are the fundamental shifts that we have gone through as an industry in the management and application of data?
  • What are the challenges that never went away?
  • You recently open sourced the dataops-testgen and dataops-observability tools. What are the outcomes that you are trying to produce with those projects?
  • What are the areas of overlap with existing tools and what are the unique capabilities that you are offering?
  • Can you talk through the technical implementation of your new observability and quality testing platform?
  • What does the onboarding and integration process look like?
  • Once a team has one or both tools set up, what are the typical points of interaction that they will have over the course of their workday?
  • What are the most interesting, innovative, or unexpected ways that you have seen dataops-observability/testgen used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on promoting DataOps?
  • What do you have planned for the future of your work at DataKitchen?
Contact Info
Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?
Links
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA