Mar 2020
43m 36s

Easier Stream Processing On Kafka With k...

Tobias Macey
About this episode
Up next
Jan 12
Semantic Operators Meet Dataframes: Building Context for Agents with FENIC
Summary: In this episode Kostas Pardalis talks about Fenic - an open-source, PySpark-inspired dataframe engine designed to bring LLM-powered semantics into reliable data engineering workflows. Kostas shares why today’s data infrastructure assumptions (BI-first, expert-operated, CP ...
56m 42s
Jan 5
Beyond Dashboards: How Data Teams Earn a Seat at the Table
Summary: In this episode Goutham Budati talks about his Data–Perspective–Action framework and how it empowers data teams to become true business partners. Gautham traces his path from automating Excel reports to leading high‑impact data organizations, then breaks down why technical exce ...
49m 21s
Dec 29
Unfreezing The Data Lake: The Future-Proof File Format
Summary: In this episode PhD researcher Xinyu Zeng talks about F3, the “future-proof file format” designed to address today’s hardware realities and evolving workloads. He digs into the limitations of Parquet and ORC - especially CPU-bound decoding, metadata overhead for wide-tabl ...
59m 24s
Recommended Episodes
Dec 2022
MongoDB Internal Architecture | The Backend Engineering Show
I’m a big believer that database systems share similar core fundamentals at their storage layer and understanding them allows one to compare different DBMS objectively. For example, how documents are stored in MongoDB is no different from how MySQL or PostgreSQL store rows. Ev ...
44m 13s
Jan 2023
MySQL on HTTP/3 | The Backend Engineering Show
The communication between backend applications and database systems has always fascinated me. The protocols keep evolving and we are in constant search for an efficient protocol that best fits the workload of backend-DB communication. In this episode of the backend engineeri ...
37m 10s
Jul 2021
Should you go with an Optimistic or Pessimistic Concurrency Control Database?
MongoDB, Postgres, Microsoft SQL Server, MySQL, and every other database manage concurrency control differently. There are two methods, pessimistic and optimistic, and both have their pros and cons. Let's explore how different databases implement this and what the effect is on perf ...
21m 46s
May 2020
How Important are algorithm and data structures in backend engineering?
Algorithms & Data Structures are critical to Backend Engineering; however, it really depends on what kind of application and infrastructure you are building. In this video I want to go through the following: 1 Backend Engineers are two types - Integrating Existing ...
13m 29s
Feb 2023
Shorten the distance between production data and insight
Modern networked applications generate a lot of data, and every business wants to make the most of that data. Most of the time, that means moving production data through some transformation process to get it ready for the analytics process. But what if you could have in-app an ...
20m 27s
Feb 2023
Postgres Architecture | The Backend Engineering Show
Creating a listener on the backend application that accepts connections is simple. You listen on an address-port pair, and connection attempts to that address and port get added to an accept queue. The application accepts connections from the queue and starts reading the data ...
34m 4s
May 2022
Why this query is fast
Welcome to another database question. In this question I created a community poll question and provided some answers. All answers can be correct, of course, but the question is which is the most efficient. This is what I try to explore in this video and compare how different data ...
17m 50s
Jun 2022
YugabyteDB supports read committed isolation
YugabyteDB is a Postgres-compatible, cloud-native database. Read committed isolation level is a critical feature, and adding it might lure more Postgres customers to move to the cloud-native database. But will it be able to compete with Google’s new AlloyDB? 0:00 Yugabyte im ...
11m 57s
Aug 2021
Table Clustering (Clustered Index) - The pros and cons
In this episode of the backend engineering show, I discuss database clustering. This is also known as table clustering, clustered index, or index-organized table; all of these names refer to the same thing. I will talk about the benefits of clustering and also the disadvantages of impleme ...
28m 33s
May 2022
#366: Optimizing PostgreSQL DB Queries with pgMustard
See the full show notes for this episode on the website at talkpython.fm/366
1h 14m