Aug 2024
1h 8m

#474: Python Performance for Data Scienc...

MICHAEL KENNEDY
About this episode
Python performance has come a long way in recent times. And it's often the data scientists, with their computational algorithms and large quantities of data, who care the most about this form of performance. It's great to have Stan Seibert back on the show to talk about Python's performance for data scientists. We cover a wide range of tools and techniques that will be valuable for many Python developers and data scientists.

Episode sponsors

Posit
Talk Python Courses

Links from the show

Stan on Twitter: @seibert
Anaconda: anaconda.com
High Performance Python with Numba training: learning.anaconda.cloud
PEP 0703: peps.python.org
Python 3.13 gets a JIT: tonybaloney.github.io
Numba: numba.pydata.org
LanceDB: lancedb.com
Profiling tips: docs.python.org
Memray: github.com
Fil: a Python memory profiler for data scientists and scientists: pythonspeed.com
Rust: rust-lang.org
Granian Server: github.com
PIXIE at SciPy 2024: github.com
Free threading Progress: py-free-threading.github.io
Free Threading Compatibility: py-free-threading.github.io
caniuse.com: caniuse.com
SPy, presented at PyCon 2024: us.pycon.org
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to us on YouTube: youtube.com
Follow Talk Python on Mastodon: talkpython
Follow Michael on Mastodon: mkennedy