Jan 2025
37m 23s

Fraud Detection with Graphs

Kyle Polich
About this episode

In this episode, Šimon Mandlík, a PhD candidate at the Czech Technical University, will talk with us about leveraging machine learning and graph-based techniques for cybersecurity applications.

We'll learn how graphs are used to detect malicious activity in networks, such as identifying harmful domains and executable files by analyzing their relationships within vast datasets.

This will include the use of hierarchical multi-instance learning (HML) to represent JSON-based network activity as graphs, and the advantages of analyzing connections between entities such as clients and domains.
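To make the multi-instance idea concrete, here is a minimal Python sketch, not the guest's actual model. It assumes a nested JSON record can be turned into a fixed-length vector by treating every list as a "bag" and averaging its members, recursively. The names `embed`, `embed_leaf`, `mean_pool`, `mix`, and `DIM`, the hash-based leaf features, and the sample record are all hypothetical; a real system would learn these mappings rather than hash them.

```python
import hashlib
import json

DIM = 8  # hypothetical embedding size


def embed_leaf(value):
    """Map a scalar (string, number, bool, None) to DIM pseudo-features via a hash."""
    digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:DIM]]


def mean_pool(vectors):
    """Order-independent aggregation of a bag of vectors (zeros for an empty bag)."""
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def mix(key_vec, value_vec):
    """Tag a field's embedding with its key so equal values under different keys differ."""
    return [k * v for k, v in zip(key_vec, value_vec)]


def embed(node):
    """Recursively embed parsed JSON: dicts and lists are pooled, scalars are hashed."""
    if isinstance(node, dict):
        return mean_pool([mix(embed_leaf(key), embed(value)) for key, value in node.items()])
    if isinstance(node, list):
        return mean_pool([embed(item) for item in node])
    return embed_leaf(node)


# Hypothetical record: one client and the bag of domains it contacted.
record = json.loads('{"client": "10.0.0.1", "domains": ["a.example", "b.example"]}')
vector = embed(record)
```

Because each bag is averaged, the result has the same length no matter how many domains the client contacted, and reordering the list does not change it (up to floating-point rounding).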

Our guest shows that while other graph methods (such as GNNs or label propagation) scale poorly or struggle with heterogeneous graphs, his method can handle both because of the "locality assumption": fraud is a local phenomenon in the graph. By relying on this assumption, we can get faster and more accurate results.
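As a rough illustration of why locality helps, the following Python sketch scores a domain using only its two-hop neighbourhood in a client-domain graph, so nothing outside that neighbourhood has to be visited. This is a simple heuristic chosen for illustration, not the method discussed in the episode. The function names, the edge list, and the set of known-bad domains are invented for the example.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (client, domain) pairs."""
    adjacency = defaultdict(set)
    for client, domain in edges:
        adjacency[client].add(domain)
        adjacency[domain].add(client)
    return adjacency


def local_score(adjacency, node, known_bad):
    """Fraction of the node's two-hop neighbours that are already known to be bad."""
    two_hop = set()
    for neighbour in adjacency.get(node, ()):
        two_hop |= adjacency.get(neighbour, set())
    two_hop.discard(node)
    if not two_hop:
        return 0.0
    return len(two_hop & known_bad) / len(two_hop)


# Hypothetical observations: which client contacted which domain.
edges = [
    ("c1", "evil.example"),
    ("c1", "unknown.example"),
    ("c2", "unknown.example"),
    ("c2", "news.example"),
    ("c3", "news.example"),
    ("c3", "shop.example"),
]
known_bad = {"evil.example"}
adjacency = build_adjacency(edges)

suspicious = local_score(adjacency, "unknown.example", known_bad)  # 0.5
benign = local_score(adjacency, "shop.example", known_bad)         # 0.0
```

The cost of scoring one node depends on the size of its neighbourhood rather than on the size of the whole graph, which is the scalability benefit that the locality assumption is meant to provide.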

-------------------------------

Want to listen ad-free? Try our Graphs Course? Join Data Skeptic+ for $5 / month or $50 / year

https://plus.dataskeptic.com
