Dec 2022
44m 13s

MongoDB Internal Architecture | The Back...

Hussein Nasser
About this episode

I’m a big believer that database systems share similar core fundamentals at their storage layer, and that understanding them allows one to compare different DBMSs objectively. For example, how documents are stored in MongoDB is no different from how MySQL or PostgreSQL store rows. Everything goes to disk; the trick is to fetch what you need from disk efficiently, with as few I/Os as possible. The rest is API. In this video I discuss the evolution of MongoDB's internal architecture and how documents are stored and retrieved, focusing on the index storage representation. I assume the reader is well versed in the fundamentals of database engineering such as indexes, B+Trees, data files, WAL, etc. You may pick up my database course to learn those skills. Let us get started.
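To make the "as few I/Os as possible" point concrete, here is a minimal sketch in Python. It is a toy model, not how MongoDB, MySQL, or PostgreSQL lay out data: the names `PagedFile`, `full_scan`, `build_index`, and `index_lookup`, the `PAGE_SIZE` of 4 records, and the in-memory dictionary standing in for an index are all assumptions made for illustration. A real index is itself stored in pages (typically a B+Tree) and costs I/O to traverse, which this sketch ignores.

```python
PAGE_SIZE = 4  # records per page (hypothetical, kept tiny for illustration)


class PagedFile:
    """Toy data file: records are grouped into fixed-size pages."""

    def __init__(self, records):
        # Split the records into pages, roughly as a storage engine lays them out.
        self.pages = [records[i:i + PAGE_SIZE]
                      for i in range(0, len(records), PAGE_SIZE)]
        self.reads = 0  # number of page reads (I/Os) performed so far

    def read_page(self, page_no):
        # Every page fetch counts as one I/O in this model.
        self.reads += 1
        return self.pages[page_no]


def full_scan(data_file, key):
    """Find a record by reading pages in order until the key turns up."""
    for page_no in range(len(data_file.pages)):
        for record in data_file.read_page(page_no):
            if record["_id"] == key:
                return record
    return None


def build_index(data_file):
    """Map each key to the page that holds it (stand-in for a real index)."""
    return {record["_id"]: page_no
            for page_no, page in enumerate(data_file.pages)
            for record in page}


def index_lookup(data_file, index, key):
    """Find a record by reading only the page the index points at."""
    page_no = index.get(key)
    if page_no is None:
        return None
    for record in data_file.read_page(page_no):
        if record["_id"] == key:
            return record
    return None


if __name__ == "__main__":
    records = [{"_id": i, "name": f"doc{i}"} for i in range(100)]

    scanned = PagedFile(records)
    full_scan(scanned, 99)
    print("full scan page reads:", scanned.reads)

    indexed = PagedFile(records)
    index_lookup(indexed, build_index(indexed), 99)
    print("index lookup page reads:", indexed.reads)
```

In this model the scan has to touch every page up to the one holding the key, while the lookup touches a single page. Whether the unit stored is a document or a row makes no difference to the I/O count.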



Fundamentals of Backend Engineering Design patterns Udemy course (link redirects to Udemy with coupon) https://backend.husseinnasser.com
Fundamentals of Networking for Effective Backends Udemy course (link redirects to Udemy with coupon) https://network.husseinnasser.com
Fundamentals of Database Engineering Udemy course (link redirects to Udemy with coupon) https://database.husseinnasser.com


