Nov 2022
59m 25s

Analyze Massive Data At Interactive Spee...

Tobias Macey
About this episode
Up next
Jan 12
Semantic Operators Meet Dataframes: Building Context for Agents with FENIC
Summary: In this episode Kostas Pardalis talks about Fenic, an open-source, PySpark-inspired dataframe engine designed to bring LLM-powered semantics into reliable data engineering workflows. Kostas shares why today’s data infrastructure assumptions (BI-first, expert-operated, CP ...
56m 42s
Jan 5
Beyond Dashboards: How Data Teams Earn a Seat at the Table
Summary: In this episode Goutham Budati talks about his Data–Perspective–Action framework and how it empowers data teams to become true business partners. Gautham traces his path from automating Excel reports to leading high‑impact data organizations, then breaks down why technical exce ...
49m 21s
Dec 29
Unfreezing The Data Lake: The Future-Proof File Format
Summary: In this episode PhD researcher Xinyu Zeng talks about F3, the “future-proof file format” designed to address today’s hardware realities and evolving workloads. He digs into the limitations of Parquet and ORC, especially CPU-bound decoding, metadata overhead for wide-tabl ...
59m 24s
Recommended Episodes
Feb 2023
Shorten the distance between production data and insight
Modern networked applications generate a lot of data, and every business wants to make the most of that data. Most of the time, that means moving production data through some transformation process to get it ready for the analytics process. But what if you could have in-app an ...
20m 27s
Nov 2021
Time Plus Data Equals Efficiency with Paul Dix, the Founder and CTO of InfluxData and the Creator of InfluxDB
If the topic of databases is brought up to certain people, their eyes may glaze over. But if that happened, it would be because they just don’t know the awesome power of databases. Data can be valuable but only if it is contextualized, and time is an extremely relevant aspec ...
36m 4s
Jul 2020
What data transformation library should I use? Pandas vs Dask vs Ray vs Modin vs Rapids (Ep. 112)
In this episode I speak about data transformation frameworks available for the data scientist who writes Python code. The usual suspect is clearly Pandas, as the most widely used library and de facto standard. However, when data volumes increase and distributed algorithms are ...
21m 10s
May 2020
How Important are algorithm and data structures in backend engineering?
Algorithms & Data Structures are critical to Backend Engineering; however, it really depends on what kind of application and infrastructure you are building. In this video I want to go through the following: 1 Backend Engineers are two types - Integrating Existing ...
13m 29s
Oct 2021
On Graph Databases | The Backend Engineering Show
I get a lot of emails asking me to talk about graph databases, so I want to start researching them, but I wanted to give you guys the framework of how I think about any databases to defuse any “magic” that might be there. In this video, I discuss what constrains a datab ...
22m 27s
Oct 2023
#628: Data on EKS
Organizations use their data to make better decisions and build innovative experiences for their customers. With the exponential growth in data, and the rapid pace of innovation in machine learning (ML), there is a growing need to build modern data applications that are agile and ...
20m 56s
Jun 2022
Using AI to Supercharge Data-Driven Applications with Zilliz
Theo is in the interviewer’s chair for this episode as Frank Liu from Zilliz joins the show to talk about how AI and machine learning are making it possible for developers to understand and extract more value from unstructured data such as text, audio, images, video, and more. Tr ...
20m
Feb 2023
Better Science Volume 2: Maps, Metadata, and the Pyramid
Jump in on a second episode of the Better Science series with guest host and Technical Evangelist Justin Emerson interviewing FlashArray engineer Feng Wang about how Pure maps data at scale with a single, scalable data structure. Managing storage in modern times requires a strate ...
46m 3s
Aug 2023
2476: ThoughtSpot - How AI Analytics is Redefining Business Intelligence
In the rapidly evolving world of data analytics, staying ahead of the curve is essential. Today on Tech Talks Daily, I'm thrilled to have Sumeet Arora from ThoughtSpot to walk us through their game-changing announcements. ThoughtSpot is already renowned for its advanced analyt ...
33m 55s