Jun 2024
26m 13s

Making ETL pipelines a thing of the past

The Stack Overflow Podcast
About this episode
RelationalAI’s first big partner is Snowflake, meaning customers can now start using their data with GenAI without worrying about the privacy, security, and governance hassle that would come with porting their data to a new cloud provider. The company promises it can also add metadata and a knowledge graph to existing data without pushing it through an ETL p ...
Up next
May 1
Time is a construct but it can still break your software
Ryan welcomes Jason Williams, senior software engineer at Bloomberg and the creator of Rust-based JavaScript engine Boa, to the show to dive into why date and time handling in JavaScript is so difficult and how the Temporal proposal aims to fix it. They explore the current flaws ...
35m 38s
Apr 28
Your LLM issues are really data issues
Ryan welcomes Harsha Chintalapani, co-founder and CTO at Collate and co-creator of Open Metadata, to the show to discuss why AI and LLMs struggle with real-time, structured production data. They explore how schema changes, inconsistent definitions (like “customer”), and weak gove ...
31m 34s
Apr 24
Lights, camera, open source!
Ryan is joined on the show by Cult.Repo producers Emma Tracey and Josiah Mcgarvie to discuss making documentaries about open-source software and the people behind the major technologies that uphold the internet. They explore why open-source projects and the people who maintain th ...
25m 33s
Recommended Episodes
Jun 2022
Simplify Data Security For Sensitive Information With The Skyflow Data Privacy Vault
Summary: The best way to make sure that you don’t leak sensitive data is to never have it in the first place. The team at Skyflow decided that the second best way is to build a storage system dedicated to securely managing ...
54m 5s
May 2022
How to Link Data to Business Outcomes
Every business likes to claim that it is "data-driven" or at least "data-informed," but too often, that's not the way things actually work. Data is relegated to an IT function, siloed and, in some cases, boils down to simply producing more repor ...
17m 32s
Aug 2023
Unpacking The Seven Principles Of Modern Data Pipelines
Summary: Data pipelines are the core of every data product, ML model, and business intelligence dashboard. If you're not careful you will end up spending all of your time on maintenance and fire-fighting. The folks at Rivery distilled the seven principles of mod ...
47m 3s
Jun 2020
Bringing Business Analytics To End Users With GoodData
Summary: The majority of analytics platforms are focused on use internal to an organization by business stakeholders. As the availability of data increases and overall literacy in how to interpret it and take action improves ther ...
52m 24s
Oct 2022
How To Bring Agile Practices To Your Data Projects
Summary: Agile methodologies have been adopted by a majority of teams for building software applications. Applying those same practices to data can prove challenging due to the number of systems that need to be included to implem ...
1h 12m
Apr 2021
Moving Machine Learning Into The Data Pipeline at Cherre
Summary: Most of the time when you think about a data pipeline or ETL job what comes to mind is a purely mechanistic progression of functions that move data from point A to point B. Sometimes, however, one of those transformation ...
48m 5s
Nov 2021
Business Intelligence Beyond The Dashboard With ClicData
Summary: Business intelligence is often equated with a collection of dashboards that show various charts and graphs representing data for an organization. What is overlooked in that characterization is the level of complexity and ...
1h 2m
Dec 2021
The Top Trends in 2022 for Data Leaders from DataRobot, Databricks, and Google
At the end of every year, you’re probably asking the same questions we are. What are the big changes coming next year? How do I stay ahead of them? And what’s separating real trends from the hype? To answer these questions, we are excited to bring together some of the to ...
1h 12m
Sep 2021
Declarative Machine Learning Without The Operational Overhead Using Continual
Summary: Building, scaling, and maintaining the operational components of a machine learning workflow are all hard problems. Add the work of creating the model itself, and it’s not surprising that a majority of companies th ...
1h 11m
Jan 2021
How Edgevana CEO Mark Thiele is Streamlining The Way Companies Access Data Centers
Mark Thiele (https://www.linkedin.com/in/markthiele/) has spent his entire life in and around IT infrastructure, even building his own fair share of data centers. But if there is one thing about the entire process that he finds vexing, it’s the wasted time between ...
46m 5s