logo
episode-header-image
Feb 2025
59m 39s

The Future of Data Engineering: AI, LLMs...

Tobias Macey
About this episode
Summary
In this episode of the Data Engineering Podcast Gleb Mezhanskiy, CEO and co-founder of DataFold, talks about the intersection of AI and data engineering. He discusses the challenges and opportunities of integrating AI into data engineering, particularly using large language models (LLMs) to enhance productivity and reduce manual toil. The conversation covers the potential of AI to transform data engineering tasks, such as text-to-SQL interfaces and creating semantic graphs to improve data accessibility, and explores practical applications of LLMs in automating code reviews, testing, and understanding data lineage.


Announcements
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details. 
  • Your host is Tobias Macey and today I'm interviewing Gleb Mezhanskiy about 
Interview
  • Introduction
  • How did you get involved in the area of data management?
  • modern data stack is dead
  • where is AI in the data stack?
  • "buy our tool to ship AI"
  • opportunities for LLM in DE workflow
Contact Info
Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
  • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
Links
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Up next
Oct 5
The Data Model That Captures Your Business: Metric Trees Explained
SummaryIn this episode of the Data Engineering Podcast Vijay Subramanian, founder and CEO of Trace, talks about metric trees - a new approach to data modeling that directly captures a company's business model. Vijay shares insights from his decade-long experience building data pr ... Show More
1h 1m
Sep 28
From GPUs-as-a-Service to Workloads-as-a-Service: Flex AI’s Path to High-Utilization AI Infra
SummaryIn this crossover episode of the AI Engineering Podcast, host Tobias Macey interviews Brijesh Tripathi, CEO of Flex AI, about revolutionizing AI engineering by removing DevOps burdens through "workload as a service". Brijesh shares his expertise from leading AI/HPC archite ... Show More
56m 31s
Sep 18
From RAG to Relational: How Agentic Patterns Are Reshaping Data Architecture
SummaryIn this episode of the AI Engineering Podcast Mark Brooker, VP and Distinguished Engineer at AWS, talks about how agentic workflows are transforming database usage and infrastructure design. He discusses the evolving role of data in AI systems, from traditional models to m ... Show More
52m 58s
Recommended Episodes
Aug 2024
Only as good as the data
You might have heard that “AI is only as good as the data.” What does that mean and what data are we talking about? Chris and Daniel dig into that topic in the episode exploring the categories of data that you might encounter working in AI (for training, testing, fine-tuning, ben ... Show More
45m 41s
Sep 17
GPT-5-Codex and the Year of Agentic Coding
Today on the AI Daily Brief, OpenAI launches GPT 5 Codex, a model designed for real-world software engineering with dynamic reasoning, long-task persistence, and powerful code review capabilities. We break down why this release cements 2025 as the year of agentic coding and what ... Show More
28m 45s
Sep 18
How People Actually Use ChatGPT
This episode of AI Daily Brief dives into two important reports on how people are really using AI tools like ChatGPT and Claude. OpenAI’s massive study with Harvard and NBER reveals consumer patterns across 1.5 million conversations, while Anthropic’s Economic Index tracks broade ... Show More
27m 39s
Aug 2024
809: Agentic AI, with Shingai Manjengwa
Agentic AI is revolutionizing the tech landscape, and Shingai Manjengwa from ChainML is here to tell us why. Discover how AI agents are becoming an integral part of our lives, automating tasks like travel bookings and daily inspiration. Shingai explains the power of multi-agent s ... Show More
1h 10m
Sep 13
The AI Office Tools That Actually Work
Not all AI office tools live up to the hype. Today we dig into new surveys, enterprise spending data, and an a16z analysis to uncover which AI tools actually perform in real-world workflows. From slides and spreadsheets to email, research, and meeting notes—we break down the tool ... Show More
26m 12s
Mar 2025
Feed Drop: How AI Will Change Your Job: MIT’s David Autor
Today’s episode is a bonus drop from our friends over at the MIT CSAIL Alliances podcast. We’ll back in two weeks for Season 11 of Me, Myself, and AI. David Autor, the Daniel (1972) and Gail Rubinfeld Professor, Margaret MacVicar Faculty Fellow in MIT’s Department of Economics, s ... Show More
40m 18s
Aug 27
Amperity Reimagines Data and Developer Workflows with AI - Ep. 271
Derek Slager, co-founder and CTO of Amperity, explores how agentic AI and vibe coding are reshaping enterprise data management and the developer experience on the NVIDIA AI Podcast. Hear how Amperity’s platform unifies customer data, powers advanced analytics, and brings conversa ... Show More
36m 40s
Sep 2024
AI is more than GenAI
GenAI is often what people think of when someone mentions AI. However, AI is much more. In this episode, Daniel breaks down a history of developments in data science, machine learning, AI, and GenAI in this episode to give listeners a better mental model. Don’t miss this one if y ... Show More
40m 3s
Jun 2025
Architecting AI-Driven Financial Systems: Innovation at the Intersection of Fintech and Emerging Tech
In this episode of the Data Science Salon Podcast, we sit down with Sasibhushan Rao Chanthati, AVP and Senior Software Engineer at T. Rowe Price, where he’s building the future of finance through intelligent, scalable technologies. Sasi specializes in creating secure digital ecos ... Show More
29m 7s
Oct 2
When Will AI Make Scientific Discoveries?
Today’s AI Daily Brief asks when artificial intelligence will begin making real scientific discoveries. We look at Periodic Labs, which just raised more than $300 million to build AI scientists and autonomous labs for physics and chemistry, and Thinking Machines, which is creatin ... Show More
24m 25s