Aug 2024
53m 30s

The Evolution of DataOps: Insights from ...

Tobias Macey
About this episode
Summary
In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Chris Bergh, CEO of DataKitchen, to discuss his ongoing mission to simplify the lives of data engineers. Chris explains the challenges data engineers face, such as constant system failures, the need for rapid changes, and high customer demands. He delves into the concept of DataOps, its evolution, and the misappropriation of related terms like data mesh and data observability, and he emphasizes the importance of focusing on processes and systems rather than just tools to improve data engineering workflows. Chris also introduces DataKitchen's open-source tools, DataOps TestGen and DataOps Observability, designed to automate data quality validation and monitor data journeys in production.
Announcements
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. It is trusted by teams of all sizes, including Comcast and DoorDash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.
  • Your host is Tobias Macey and today I'm interviewing Chris Bergh about his tireless quest to simplify the lives of data engineers
Interview
  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what DataKitchen is and the story behind it?
  • You helped to define and popularize "DataOps", which then went through a journey of misappropriation similar to "DevOps", and has since faded in use. What is your view on the realities of "DataOps" today?
  • Out of the popularized wave of "DataOps" tools came subsequent trends in data observability, data reliability engineering, etc. How have those cycles influenced the way that you think about the work that you are doing at DataKitchen?
  • The data ecosystem went through a massive growth period over the past ~7 years, and we are now entering a cycle of consolidation. What are the fundamental shifts that we have gone through as an industry in the management and application of data?
  • What are the challenges that never went away?
  • You recently open sourced the dataops-testgen and dataops-observability tools. What are the outcomes that you are trying to produce with those projects?
  • What are the areas of overlap with existing tools and what are the unique capabilities that you are offering?
  • Can you talk through the technical implementation of your new observability and quality testing platform?
  • What does the onboarding and integration process look like?
  • Once a team has one or both tools set up, what are the typical points of interaction that they will have over the course of their workday?
  • What are the most interesting, innovative, or unexpected ways that you have seen dataops-observability/testgen used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on promoting DataOps?
  • What do you have planned for the future of your work at DataKitchen?
Contact Info
Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?
Links
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA