May 2023
1h 8m

675: Pandas for Data Analysis and Visual...

Jon Krohn
About this episode

Wrangling data in pandas, when to use pandas, Matplotlib, or seaborn, and why you should learn to create Python packages: Jon Krohn speaks with guest Stefanie Molin, author of Hands-On Data Analysis with Pandas.

This episode is brought to you by Posit, the open-source data science company, and by AWS Inferentia. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.

In this episode you will learn:
• The advantages of using pandas over other libraries [07:55]
• Why data wrangling in pandas is so helpful [12:05]
• Stefanie’s Data Morph library [24:27]
• When to use pandas, matplotlib, or seaborn [33:45]
• Understanding the ticker module in matplotlib [36:48]
• Where data analysts should start their learning journey [40:08]
• What it’s like being a software engineer at Bloomberg [51:19]

Additional materials: www.superdatascience.com/675
