Nov 2024
42m 24s

GitHub Collaboration Network

Kyle Polich
About this episode

In this episode we discuss the GitHub Collaboration Network with Behnaz Moradi-Jamei, assistant professor at James Madison University. As a network scientist, Behnaz created and analyzed a network of about 700,000 contributors to GitHub repositories. The network was built by treating developers as nodes and linking them with edges based on shared contributions to the same repositories: if two developers contributed to the same project, an edge (connection) was formed between them to represent a collaborative relationship. The resulting network contains 32 million such connections.
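The construction described above can be sketched in a few lines of Python. This is a minimal illustration on invented toy data, not the pipeline used in the research: the function name `build_collaboration_edges`, the developer and repository names, and the choice to weight each edge by the number of shared repositories are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_collaboration_edges(contributions):
    """Turn (developer, repository) pairs into developer-developer edges.

    Hypothetical helper: returns a dict mapping a sorted developer pair to
    the number of repositories both contributed to (an assumed weighting).
    """
    # Group developers by the repository they contributed to.
    contributors_by_repo = defaultdict(set)
    for developer, repo in contributions:
        contributors_by_repo[repo].add(developer)

    # Every pair of developers sharing a repository gets an edge.
    edge_weights = defaultdict(int)
    for developers in contributors_by_repo.values():
        for a, b in combinations(sorted(developers), 2):
            edge_weights[(a, b)] += 1
    return dict(edge_weights)


# Invented toy data: three shared repositories and one solo contributor.
toy_contributions = [
    ("alice", "repo-x"), ("bob", "repo-x"), ("carol", "repo-x"),
    ("alice", "repo-y"), ("bob", "repo-y"),
    ("dave", "repo-z"),
]
edges = build_collaboration_edges(toy_contributions)
```

In this toy input, a developer who only works alone (here `dave`) ends up with no edges. Note also that a repository with k contributors produces k(k-1)/2 pairs, which is one reason a network of this kind can have far more edges than nodes.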
Behnaz's analysis applies community detection algorithms to reveal how developer communities form, function, and evolve, offering guidance for OSS community managers.
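The episode description does not say which community detection algorithms were used. As a hedged illustration of the general idea, the sketch below computes modularity, one commonly used score for how well a partition separates a network into densely connected groups. The graph, the node names, and the `modularity` helper are invented for this example and assume an unweighted, undirected network.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity of a partition of an unweighted, undirected graph.

    edges: list of (u, v) pairs; community_of: dict node -> community label.
    Hypothetical helper written for this illustration.
    """
    m = len(edges)
    internal_edges = defaultdict(int)  # edges with both ends in a community
    degree_sum = defaultdict(int)      # total degree of a community's nodes
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal_edges[community_of[u]] += 1
    return sum(
        internal_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph: two tightly knit triangles joined by a single bridge edge.
toy_edges = [
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("d", "e"), ("d", "f"), ("e", "f"),
    ("c", "d"),
]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_split = modularity(toy_edges, two_groups)
q_single = modularity(toy_edges, one_group)
```

Splitting the toy graph into its two triangles scores higher than lumping every node into one group. Modularity-based detection methods search for partitions that make this score large.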
