Jun 2025
36m 46s

GitHub Network Analysis

Kyle Polich
About this episode

In this episode, we'll discuss how to treat GitHub data as a network to extract insights about teamwork.

Our guest, Gabriel Ramirez, manager of the notifications team at GitHub, will show how to apply network analysis to better understand and improve collaboration within his engineering team by analyzing GitHub metadata - such as pull requests, issues, and discussions - as a bipartite graph of people and projects.
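To make the bipartite idea concrete, here is a minimal sketch in Python. It is not taken from GH Graph Explorer; the records, names, and helper functions are hypothetical, and real input would come from GitHub pull requests, issues, and discussions. It links people to the items they touched, then projects that onto a person-to-person collaboration graph.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (person, item) records, made up for illustration only.
interactions = [
    ("alice", "pr-1"),
    ("bob", "pr-1"),
    ("bob", "issue-7"),
    ("carol", "issue-7"),
    ("carol", "discussion-3"),
]

def build_bipartite(records):
    """Map each item to the set of people who touched it."""
    people_by_item = defaultdict(set)
    for person, item in records:
        people_by_item[item].add(person)
    return people_by_item

def project_to_people(people_by_item):
    """Link two people when they share an item; weight = number of shared items."""
    weights = defaultdict(int)
    for people in people_by_item.values():
        for a, b in combinations(sorted(people), 2):
            weights[(a, b)] += 1
    return dict(weights)

people_by_item = build_bipartite(interactions)
collaboration = project_to_people(people_by_item)
print(collaboration)
```

In this toy data, an item touched by only one person (here `discussion-3`) adds no edge to the projected graph.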

We'll discuss how network centrality measures (like eigenvector and betweenness centrality) reveal organizational dynamics, how vacation patterns influence team connectivity, and how decentralizing communication hubs can foster healthier collaboration.
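As a rough illustration of why betweenness centrality can flag a communication hub, the sketch below implements Brandes' algorithm for an unweighted, undirected graph using only the standard library. The team graph and its names are invented for this example, and this is not necessarily how the guest's tooling computes centrality.

```python
from collections import deque

def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    adjacency maps each node to the set of its neighbors.
    Returns unnormalized scores, counting each unordered pair once.
    """
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from source.
        order = []
        predecessors = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbor in adjacency[node]:
                if distance[neighbor] < 0:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[node] + 1:
                    path_count[neighbor] += path_count[node]
                    predecessors[neighbor].append(node)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {node: 0.0 for node in adjacency}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1.0 + dependency[node])
            if node != source:
                scores[node] += dependency[node]
    # Each undirected pair was counted once from each endpoint.
    return {node: score / 2.0 for node, score in scores.items()}

# Hypothetical team: "hub" is the only link between two pairs of colleagues.
team = {
    "ana": {"ben", "hub"},
    "ben": {"ana", "hub"},
    "hub": {"ana", "ben", "cy", "dee"},
    "cy": {"hub", "dee"},
    "dee": {"hub", "cy"},
}
scores = betweenness_centrality(team)
print(scores)
```

Here every shortest path between the two pairs runs through `hub`, so it carries all of the betweenness while everyone else scores zero. Adding a direct link between, say, `ana` and `cy` would lower that concentration, which is the intuition behind decentralizing a hub.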

Gabriel's open-source project, GH Graph Explorer, enables other managers and engineers to extract, visualize, and analyze their own GitHub activity using tools like Python, Neo4j, and Gephi, with LLMs for insight generation. But always remember: don't take the results at face value. Instead, use them to guide your qualitative investigation.
