Dec 2024
1h 2m

#491: DuckDB and Python: Ducks and Snake...

MICHAEL KENNEDY
About this episode
Join me for an insightful conversation with Alex Monahan, who works on documentation, tutorials, and training at DuckDB Labs. We explore why DuckDB is gaining momentum among Python and data enthusiasts, from its in-process database design to its blazingly fast, columnar architecture. We also dive into indexing strategies, concurrency considerations, and the fascinating way MotherDuck (the cloud companion to DuckDB) handles large-scale data seamlessly. Don’t miss this chance to learn how a single pip install could totally transform your Python data workflow!

Episode sponsors

Sentry Error Monitoring, Code TALKPYTHON
Data Citizens Podcast
Talk Python Courses

Links from the show

Alex on Mastodon: @__Alex__

DuckDB: duckdb.org
MotherDuck: motherduck.com
SQLite: sqlite.org
Moka-Py: github.com
PostgreSQL: www.postgresql.org
MySQL: www.mysql.com
Redis: redis.io
Apache Parquet: parquet.apache.org
Apache Arrow: arrow.apache.org
Pandas: pandas.pydata.org
Polars: pola.rs
Pyodide: pyodide.org
DB-API (PEP 249): peps.python.org/pep-0249
Flask: flask.palletsprojects.com
Gunicorn: gunicorn.org
MinIO: min.io
Amazon S3: aws.amazon.com/s3
Azure Blob Storage: azure.microsoft.com/products/storage
Google Cloud Storage: cloud.google.com/storage
DigitalOcean: www.digitalocean.com
Linode: www.linode.com
Hetzner: www.hetzner.com
BigQuery: cloud.google.com/bigquery
DBT (Data Build Tool): docs.getdbt.com
Mode: mode.com
Hex: hex.tech
Python: www.python.org
Node.js: nodejs.org
Rust: www.rust-lang.org
Go: go.dev
.NET: dotnet.microsoft.com
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy