Jul 2018
59m 23s

#29 Machine Learning & Data Science at G...

DATACAMP
About this episode

Omoju Miller, a Senior Machine Learning Data Scientist with GitHub, speaks with Hugo about the role of data science in product development at GitHub, what it means to “use computation to build products to solve real-life decision making, practical challenges,” and what building data products at GitHub actually looks like.

Machine learning has the power to automate so much of the drudgery around data science & software engineering, from automated code review to flagging security vulnerabilities in code, and from recommending repositories to contributors to matching issues with maintainers and contributors and identifying duplicate issues.

And just in case that’s not enough, they’ll discuss GitHub as a platform for all kinds of work, not just technical work, and, as Omoju has called it, “a collaborative work environment centered around humans.”

Up next
Today
#317 How to Reengineer Your Business Processes with Nelson Repenning, Distinguished Professor at MIT Sloan & Don Kieffer, Senior Lecturer in Operations Management at MIT Sloan
Every day, knowledge workers face the challenge of managing competing priorities and constant interruptions. When systems are managing us rather than us managing them, productivity suffers and morale plummets. But what if the key to improvement isn't complex reorganization but ra ...
1h 6m
Aug 18
#316 Enterprise AI Agents with Jun Qian, VP of Generative AI Services at Oracle
Combining LLMs with enterprise knowledge bases is creating powerful new agents that can transform business operations. These systems are dramatically improving on traditional chatbots by understanding context, following conversations naturally, and accessing up-to-date informatio ...
56m 36s
Aug 13
#315 DataFramed x Alter Everything: Future-Proofing Your Career in AI and Data Analytics | Richie & Megan Bowers
The relationship between AI and data professionals is evolving rapidly, creating both opportunities and challenges. As companies embrace AI-first strategies and experiment with AI agents, the skills needed to thrive in data roles are fundamentally changing. Is coding knowledge st ...
41m 5s
Recommended Episodes
Feb 2019
Machine Learning In The Enterprise
Summary Machine learning is a class of technologies that promise to revolutionize business. Unfortunately, it can be difficult to identify and execute on ways that it can be used in large companies. Kevin Dewalt founded Prolego to help Fortune 500 companies build, launch, and mai ...
48m 19s
Jan 2022
Making Agile work for data science
Data scientists and engineers don’t always play well together. Data scientists will plan out a solution, carefully build models, test them in notebooks, then throw that solution over the wall to engineering. Implementing that solution can take months. Historically, the data scienc ...
20m 52s
May 2021
Buy AND Build for Production Machine Learning with Nir Bar-Lev - #488
Today we’re joined by Nir Bar-Lev, co-founder and CEO of ClearML. In our conversation with Nir, we explore how his view of the wide vs deep machine learning platforms paradox has changed and evolved over time, how companies should think about building vs buying and integration, a ...
43m 24s
Mar 2021
Bridging The Gap Between Machine Learning And Operations At Iguazio
Summary The process of building and deploying machine learning projects requires a staggering number of systems and stakeholders to work in concert. In this episode Yaron Haviv, co-founder of Iguazio, discusses the complexities inherent to the process, as well as how he has worke ...
1h 6m
Mar 2022
Bayesian Machine Learning with Ravin Kumar (Ep. 191)
This is one episode where passions for math, statistics, and computers merge. I have a very interesting conversation with Ravin, a data scientist at Google, where he uses data to inform decisions. He has previously worked at Sweetgreen, designing systems that would benefit team ...
31m 12s
Jan 2022
Automated Data Quality Management Through Machine Learning With Anomalo
Summary Data quality control is a requirement for being able to trust the various reports and machine learning models that are relying on the information that you curate. Rules based systems are useful for validating known requirements, but with the scale and complexity of data i ...
1h 2m
Jul 2021
Exploring The Design And Benefits Of The Modern Data Stack
Summary We have been building platforms and workflows to store, process, and analyze data since the earliest days of computing. Over that time there have been countless architectures, patterns, and "best practices" to make that task manageable. With the growing popularity of clou ...
49m 2s
Apr 2021
Moving Machine Learning Into The Data Pipeline at Cherre
Summary Most of the time when you think about a data pipeline or ETL job what comes to mind is a purely mechanistic progression of functions that move data from point A to point B. Sometimes, however, one of those transformations is actually a full-fledged machine learning projec ...
48m 5s
Sep 2021
Declarative Machine Learning Without The Operational Overhead Using Continual
Summary Building, scaling, and maintaining the operational components of a machine learning workflow are all hard problems. Add the work of creating the model itself, and it’s not surprising that a majority of companies that could greatly benefit from machine learning have yet to ...
1h 11m
Jun 2021
Lessons Learned From The Pipeline Data Engineering Academy
Summary Data Engineering is a broad and constantly evolving topic, which makes it difficult to teach in a concise and effective manner. Despite that, Daniel Molnar and Peter Fabian started the Pipeline Academy to do exactly that. In this episode they reflect on the lessons that t ...
1h 11m