Jun 2025
36m 46s

GitHub Network Analysis

Kyle Polich
About this episode

In this episode, we'll discuss how to treat GitHub data as a network to extract insights about teamwork.

Our guest, Gabriel Ramirez, manager of the notifications team at GitHub, will show how to apply network analysis to better understand and improve collaboration within his engineering team by analyzing GitHub metadata - such as pull requests, issues, and discussions - as a bipartite graph of people and projects.
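To make the bipartite idea concrete, here is a minimal sketch in plain Python. It is not code from the episode or from GH Graph Explorer; the people, project names, and helper functions are all hypothetical. It links people to the projects they touched, then projects that graph onto people so that two people are connected by the number of projects they share.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical activity records: one (person, project) pair per pull request,
# issue, or discussion. Real data would come from GitHub metadata.
events = [
    ("ana", "repo-api"),
    ("ana", "repo-web"),
    ("ben", "repo-api"),
    ("ben", "repo-web"),
    ("cy", "repo-web"),
    ("dee", "repo-docs"),
]


def build_bipartite(events):
    """Map each project to the set of people who were active on it."""
    members = defaultdict(set)
    for person, project in events:
        members[project].add(person)
    return members


def project_onto_people(members):
    """Weight each pair of people by the number of projects they share."""
    weights = defaultdict(int)
    for people in members.values():
        for a, b in combinations(sorted(people), 2):
            weights[(a, b)] += 1
    return dict(weights)


members = build_bipartite(events)
collab = project_onto_people(members)
print(collab)
```

In this toy data, "dee" works alone on one project and so has no edges in the people-to-people projection, which is the kind of isolation a manager might want to ask about rather than conclude from.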

Some insights we'll discuss are how network centrality measures (like eigenvector and betweenness centrality) reveal organizational dynamics, how vacation patterns influence team connectivity, and how decentralizing communication hubs can foster healthier collaboration. 
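As a rough illustration of the two centrality measures named above, the sketch below computes them in plain Python on a made-up collaboration graph: two tightly knit triangles joined by a single bridge. The graph, node names, and function names are assumptions for illustration only. Betweenness uses Brandes' algorithm for unweighted graphs, and eigenvector centrality uses simple power iteration.

```python
from collections import deque

# Hypothetical people-to-people graph: triangles a-b-c and d-e-f, bridged by c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes, unweighted, undirected)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                  # accumulate dependencies, farthest first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


def eigenvector_centrality(adj, iterations=100):
    """Power iteration on the adjacency matrix, L2-normalized each step."""
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        nxt = {v: sum(score[w] for w in adj[v]) for v in adj}
        norm = sum(x * x for x in nxt.values()) ** 0.5
        score = {v: x / norm for v, x in nxt.items()}
    return score


bc = betweenness(adj)
ev = eigenvector_centrality(adj)
for person in sorted(adj):
    print(person, round(bc[person], 2), round(ev[person], 3))
```

Here "c" and "d" carry every shortest path between the two triangles, so they score highest on betweenness. A team that depends on one or two such bridges is the hub pattern the episode suggests decentralizing.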

Gabriel’s open-source project, GH Graph Explorer, enables other managers and engineers to extract, visualize, and analyze their own GitHub activity using tools like Python, Neo4j, and Gephi, plus LLMs for insight generation. But always remember: don't take the results at face value. Instead, use them to guide your qualitative investigation.
