Jul 2018
59m 23s

#29 Machine Learning & Data Science at G...

DATACAMP
About this episode

Omoju Miller, a Senior Machine Learning Data Scientist at GitHub, speaks with Hugo about the role of data science in product development at GitHub, what it means to “use computation to build products to solve real-life decision making, practical challenges,” and what building data products at GitHub actually looks like.

Machine learning has the power to automate so much of the drudgery around data science & software engineering, from automated code review to flagging security vulnerabilities in code, and from recommending repositories to contributors to matching issues with maintainers and contributors and identifying duplicate issues.

And just in case that’s not enough, they’ll discuss GitHub as a platform for all kinds of work, not just technical work, and, as Omoju has called it, “a collaborative work environment centered around humans.”

Up next
Jul 8
#309 What Science Fiction Can Tell Us About the Future of AI with Ken Liu, Sci-Fi Author
Technology and human consciousness are converging in ways that challenge our fundamental understanding of creativity and connection. As AI systems become increasingly sophisticated at mimicking human thought patterns, we're entering uncharted territory where machines don't just a ...
1h 17m
Jul 3
Industry Roundup #5: AI Agents Hype vs. Reality, Meta’s $15B Stake in Scale AI, and the First Fully AI-Generated NBA Ad
Welcome to DataFramed Industry Roundups! In this series of episodes, we sit down to discuss the latest and greatest in data & AI. In this episode, with special guest, DataCamp COO Martijn, we touch upon the hype and reality of AI agents in business, the McKinsey vs. Ethan Mollick ...
53m 2s
Jun 30
#308 A Framework for GenAI App and Agent Development with Jerry Liu, CEO at LlamaIndex
The enterprise adoption of AI agents is accelerating, but significant challenges remain in making them truly reliable and effective. While coding assistants and customer service agents are already delivering value, more complex document-based workflows require sophisticated archi ...
52m 21s
Recommended Episodes
Feb 2019
Machine Learning In The Enterprise
Summary Machine learning is a class of technologies that promise to revolutionize business. Unfortunately, it can be difficult to identify and execute on ways that it can be used in large companies. Kevin Dewalt founded Prolego to help Fortune 500 companies build, launch, and mai ...
48m 19s
Jan 2022
Making Agile work for data science
Data scientists and engineers don’t always play well together. Data scientists will plan out a solution, carefully build models, test them in notebooks, then throw that solution over the wall to engineering. Implementing that solution can take months. Historically, the data scienc ...
20m 52s
May 2021
Buy AND Build for Production Machine Learning with Nir Bar-Lev - #488
Today we’re joined by Nir Bar-Lev, co-founder and CEO of ClearML. In our conversation with Nir, we explore how his view of the wide vs deep machine learning platforms paradox has changed and evolved over time, how companies should think about building vs buying and integration, a ...
43m 24s
Mar 2021
Bridging The Gap Between Machine Learning And Operations At Iguazio
Summary The process of building and deploying machine learning projects requires a staggering number of systems and stakeholders to work in concert. In this episode Yaron Haviv, co-founder of Iguazio, discusses the complexities inherent to the process, as well as how he has worke ...
1h 6m
Mar 2022
Bayesian Machine Learning with Ravin Kumar (Ep. 191)
This is one episode where passion for math, statistics, and computers merge. I have a very interesting conversation with Ravin, a data scientist at Google, where he uses data to inform decisions. He has previously worked at Sweetgreen, designing systems that would benefit team ...
31m 12s
Jan 2022
Automated Data Quality Management Through Machine Learning With Anomalo
Summary Data quality control is a requirement for being able to trust the various reports and machine learning models that are relying on the information that you curate. Rules based systems are useful for validating known requirements, but with the scale and complexity of data i ...
1h 2m
Jul 2021
Exploring The Design And Benefits Of The Modern Data Stack
Summary We have been building platforms and workflows to store, process, and analyze data since the earliest days of computing. Over that time there have been countless architectures, patterns, and "best practices" to make that task manageable. With the growing popularity of clou ...
49m 2s
Apr 2021
Moving Machine Learning Into The Data Pipeline at Cherre
Summary Most of the time when you think about a data pipeline or ETL job what comes to mind is a purely mechanistic progression of functions that move data from point A to point B. Sometimes, however, one of those transformations is actually a full-fledged machine learning projec ...
48m 5s
Sep 2021
Declarative Machine Learning Without The Operational Overhead Using Continual
Summary Building, scaling, and maintaining the operational components of a machine learning workflow are all hard problems. Add the work of creating the model itself, and it’s not surprising that a majority of companies that could greatly benefit from machine learning have yet to ...
1h 11m
Jun 2021
Lessons Learned From The Pipeline Data Engineering Academy
Summary Data Engineering is a broad and constantly evolving topic, which makes it difficult to teach in a concise and effective manner. Despite that, Daniel Molnar and Peter Fabian started the Pipeline Academy to do exactly that. In this episode they reflect on the lessons that t ...
1h 11m