Aug 2024
1h 8m

#474: Python Performance for Data Scienc...

MICHAEL KENNEDY
About this episode
Python performance has come a long way in recent times. And it's often the data scientists, with their computational algorithms and large quantities of data, who care the most about this form of performance. It's great to have Stan Seibert back on the show to talk about Python's performance for data scientists. We cover a wide range of tools and techniques that will be valuable for many Python developers and data scientists.

Episode sponsors

Posit
Talk Python Courses

Links from the show

Stan on Twitter: @seibert
Anaconda: anaconda.com
High Performance Python with Numba training: learning.anaconda.cloud
PEP 0703: peps.python.org
Python 3.13 gets a JIT: tonybaloney.github.io
Numba: numba.pydata.org
LanceDB: lancedb.com
Profiling tips: docs.python.org
Memray: github.com
Fil: a Python memory profiler for data scientists and scientists: pythonspeed.com
Rust: rust-lang.org
Granian Server: github.com
PIXIE at SciPy 2024: github.com
Free threading Progress: py-free-threading.github.io
Free Threading Compatibility: py-free-threading.github.io
caniuse.com: caniuse.com
SPy, presented at PyCon 2024: us.pycon.org
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to us on YouTube: youtube.com
Follow Talk Python on Mastodon: talkpython
Follow Michael on Mastodon: mkennedy