In this episode, we explore the fascinating world of recommender systems and algorithmic fairness with David Liu, Assistant Research Professor at Cornell University's Center for Data Science for Enterprise and Society. David shares insights from his research on how machine learning models can inadvertently create unfairness, particularly for minority and niche user groups, even without any malicious intent. We dive deep into his groundbreaking work on Principal Component Analysis (PCA) and collaborative filtering, examining why these fundamental techniques sometimes fail to serve all users equally.
David introduces the concept of "power niche users" - highly active users with specialized interests who generate valuable data that can benefit the entire platform. We discuss his paper "When Collaborative Filtering Is Not Collaborative," which reveals how PCA can over-specialize on popular content while neglecting both niche items and even failing to properly recommend popular artists to new potential fans. David presents solutions through item-weighted PCA and thoughtful data upweighting strategies that can improve both fairness and performance simultaneously, challenging the common assumption that these goals must be in tension. The conversation spans from theoretical insights to practical applications at companies like Meta, offering a comprehensive look at the future of personalized recommendations.