Jul 2020
21m 10s

What data transformation library should ...

FRANCESCO GADALETA
About this episode
In this episode I speak about data transformation frameworks available to data scientists who write Python code. The usual suspect is clearly Pandas, as the most widely used library and de facto standard. However, when data volumes increase and distributed algorithms are in place (according to a map-reduce paradigm of computation), Pandas no longer perfo ...