Dec 2022
59m 22s

Declarative Machine Learning For High Pe...

Tobias Macey
About this episode

Preamble

This is a cross-over episode from our new show The Machine Learning Podcast, the show about going from idea to production with machine learning.

Summary

Deep learning is a revolutionary category of machine learning that accelerates our ability to build powerful inference models. Along with that power comes a great deal of complexity in determining what neural architectures are best suited to a given task, engineering features, scaling computation, etc. Predibase is building on the successes of the Ludwig framework for declarative deep learning and Horovod for horizontally distributing model training. In this episode CTO and co-founder of Predibase, Travis Addair, explains how they are reducing the burden of model development even further with their managed service for declarative and low-code ML and how they are integrating with the growing ecosystem of solutions for the full ML lifecycle.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great!
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle-tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host is Tobias Macey and today I’m interviewing Travis Addair about Predibase, a low-code platform for building ML models in a declarative format

Interview

  • Introduction
  • How did you get involved in machine learning?
  • Can you describe what Predibase is and the story behind it?
  • Who is your target audience and how does that focus influence your user experience and feature development priorities?
  • How would you describe the semantic differences between your chosen terminology of "declarative ML" and the "autoML" nomenclature that many projects and products have adopted?
    • Another platform that launched recently with a promise of "declarative ML" is Continual. How would you characterize your relative strengths?
  • Can you describe how the Predibase platform is implemented?
    • How have the design and goals of the product changed as you worked through the initial implementation and started working with early customers?
    • The operational aspects of the ML lifecycle are still fairly nascent. How have you thought about the boundaries for your product to avoid getting drawn into scope creep while providing a happy path to delivery?
  • Ludwig is a core element of your platform. What are the other capabilities that you are layering around and on top of it to build a differentiated product?
  • In addition to the existing interfaces for Ludwig you created a new language in the form of PQL. What was the motivation for that decision?
    • How did you approach the semantic and syntactic design of the dialect?
    • What is your vision for PQL in the space of "declarative ML" that you are working to define?
  • Can you describe the available workflows for an individual or team that is using Predibase for prototyping and validating an ML model?
    • Once a model has been deemed satisfactory, what is the path to production?
  • How are you approaching governance and sustainability of Ludwig and Horovod while balancing your reliance on them in Predibase?
  • What are some of the notable investments/improvements that you have made in Ludwig during your work of building Predibase?
  • What are the most interesting, innovative, or unexpected ways that you have seen Predibase used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Predibase?
  • When is Predibase the wrong choice?
  • What do you have planned for the future of Predibase?

Contact Info

Parting Question

  • From your perspective, what is the biggest barrier to adoption of machine learning today?

Closing Announcements

  • Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.

Links

The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
