Nov 5
34m 48s

Shilling Attacks on Recommender Systems

Kyle Polich
About this episode

In this episode of Data Skeptic's Recommender Systems series, Kyle sits down with Aditya Chichani, a senior machine learning engineer at Walmart, to explore the darker side of recommendation algorithms. The conversation centers on shilling attacks, a form of manipulation in which malicious actors create multiple fake profiles to game a recommender system, either to promote specific items or to sabotage competitors. Aditya researched these attacks during his undergraduate studies at SPIT before completing his master's in computer science with a data science specialization at UC Berkeley, and he explains why collaborative filtering systems are particularly vulnerable. From promoting a friend's ska band on Spotify to inflating product ratings on e-commerce platforms, shilling attacks are a significant threat in an industry where approximately 4% of reviews are fake, translating to $800 billion in annual sales in the US alone.

The discussion delves into collaborative filtering, covering both the user-user and item-item approaches, which build similarity matrices to predict user preferences. These systems face shilling attacks of increasing sophistication. Random attacks need minimal information: fake profiles rate filler items close to the overall average rating. Average attacks use knowledge of each item's mean rating so that fake profiles look authentic. Bandwagon attacks give high ratings to very popular items so that fake profiles resemble many genuine users. Segment attacks go after a specific group of users with shared tastes (such as fans of Taylor Swift albums), rating the items that group favors to build credibility before promoting the target item. User-user collaborative filtering proves particularly vulnerable, requiring as few as 500 fake profiles to shift recommendations, while item-item filtering demands significantly more resources from an attacker.

Aditya addresses detection through machine learning techniques that analyze behavioral patterns, using methods like PCA to identify profiles with unusually high correlation and suspiciously consistent ratings. Detection remains an evolving challenge because attackers adapt their strategies, and they now use large language models to generate more authentic-seeming fake reviews. His research tested detection algorithms against synthetic attacks on the MovieLens dataset, and the same concerns extend to modern e-commerce systems. Companies rarely share attack and detection data publicly, to avoid giving attackers an advantage, but academic research continues to advance both offensive and defensive strategies in recommender systems security.
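To make the attack types more concrete, here is a minimal Python sketch of how random, average, and bandwagon "push" profiles could be generated and injected into a small user-user collaborative filter. It is an illustration only, not the method discussed in the episode: the function names `make_attack_profiles` and `predict_user_user`, the 1-to-5 rating scale, the use of 0 for "unrated", the noise level of 0.5, and the mean-centered cosine similarity are all assumptions chosen for brevity.

```python
import numpy as np


def make_attack_profiles(ratings, target_item, n_profiles, n_filler,
                         kind="random", n_popular=3, r_max=5.0, seed=None):
    """Build fake "push" profiles (hypothetical helper, not from the episode).

    ratings: (users x items) array on a 1..r_max scale, with 0 meaning "unrated".
    """
    if kind not in ("random", "average", "bandwagon"):
        raise ValueError(f"unknown attack kind: {kind}")
    rng = np.random.default_rng(seed)
    rated = ratings > 0
    counts = rated.sum(axis=0)
    global_mean = ratings[rated].mean()
    global_std = ratings[rated].std()
    # Per-item mean rating; items nobody rated fall back to the global mean.
    item_means = np.where(counts > 0,
                          ratings.sum(axis=0) / np.maximum(counts, 1),
                          global_mean)

    candidates = np.setdiff1d(np.arange(ratings.shape[1]), [target_item])
    popular = np.array([], dtype=int)
    if kind == "bandwagon":
        # Most-rated items, which many genuine users are likely to share.
        popular = candidates[np.argsort(-counts[candidates])[:n_popular]]
    pool = np.setdiff1d(candidates, popular)

    profiles = np.zeros((n_profiles, ratings.shape[1]))
    for profile in profiles:  # each row is a view, so writes land in `profiles`
        filler = rng.choice(pool, size=n_filler, replace=False)
        if kind == "average":
            # Filler ratings track each item's own mean (assumed noise of 0.5).
            values = rng.normal(item_means[filler], 0.5)
        else:
            # Random and bandwagon fillers are drawn around the global mean.
            values = rng.normal(global_mean, global_std, size=n_filler)
        profile[filler] = np.clip(np.rint(values), 1, r_max)
        profile[popular] = r_max
        profile[target_item] = r_max  # the item being promoted
    return profiles


def predict_user_user(ratings, user, item, k=10):
    """Mean-centered k-nearest-neighbor prediction for one (user, item) pair."""
    rated = ratings > 0
    means = ratings.sum(axis=1) / np.maximum(rated.sum(axis=1), 1)
    centered = np.where(rated, ratings - means[:, None], 0.0)
    norms = np.linalg.norm(centered, axis=1)
    sims = centered @ centered[user] / np.maximum(norms * norms[user], 1e-12)
    sims[user] = -np.inf              # never use the user as their own neighbor
    sims[~rated[:, item]] = -np.inf   # neighbors must have rated the item
    neighbors = np.argsort(sims)[-k:]
    neighbors = neighbors[sims[neighbors] > 0]
    if neighbors.size == 0:
        return float(means[user])
    weights = sims[neighbors]
    return float(means[user] + weights @ centered[neighbors, item] / weights.sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    genuine = rng.integers(1, 6, size=(200, 30)).astype(float)
    genuine[rng.random(genuine.shape) < 0.5] = 0.0  # roughly half unrated
    genuine[0, 7] = 0.0                             # user 0 has not rated item 7
    fake = make_attack_profiles(genuine, target_item=7, n_profiles=40,
                                n_filler=10, kind="average", seed=1)
    attacked = np.vstack([genuine, fake])
    print("prediction before attack:", predict_user_user(genuine, 0, 7))
    print("prediction after attack: ", predict_user_user(attacked, 0, 7))
```

How far the prediction moves depends on the data, the number of fake profiles, and how closely the fillers mimic genuine users, so the demo prints both values rather than promising a particular shift. A segment attack could be sketched the same way by replacing the popular items with items favored by the targeted group.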
