Oct 2021
28m 58s

A murder mystery: who killed our user ex...

The Stack Overflow Podcast
About this episode

The infrastructure that networked applications live on is getting more and more complicated. There was a time when you could serve an application from a single machine on premises. But now, with cloud computing offering painless scaling to meet your demand, your infrastructure becomes abstracted and not something you have direct contact with. Compound that problem with an architecture spread across dozens, even hundreds of microservices, replicated across multiple data centers in an ever-changing cloud, and tracking down the source of system failures becomes something like a murder mystery. Who shot our uptime in the foot?

A good observability system helps with that. On this sponsored episode of the Stack Overflow Podcast, we talk with Greg Leffler of Splunk about the keys to instrumenting an observable system and how the OpenTelemetry standard makes observability easier, even if you aren’t using Splunk’s product. 

Observability is really an outgrowth of traditional monitoring. You expect that some service or system could break, so you keep an eye on it. But observability applies that monitoring to an entire system and gives you the ability to answer the unexpected questions that come up. It uses three principal ways of viewing system data: logs, traces, and metrics.

Metrics are numbers with timestamps that record particular details of the system. Traces follow a request through a system. And logs are the causes and effects recorded from a system in motion. Splunk wants to add a fourth signal, events, which would track specific user events and browser failures.
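To make the distinction between the signals concrete, here is a minimal sketch in plain Python. This is an illustration only, not OpenTelemetry's or any vendor's actual schema; the field names are assumptions chosen for clarity.

```python
from dataclasses import dataclass, field
from typing import Optional
import time
import uuid

@dataclass
class Metric:
    # A metric is essentially a number plus a timestamp,
    # with labels identifying what it measures.
    name: str
    value: float
    timestamp: float = field(default_factory=time.time)
    labels: dict = field(default_factory=dict)

@dataclass
class Span:
    # A trace is a tree of spans; each span is one hop of a request,
    # tied to the others by a shared trace_id.
    trace_id: str
    name: str
    parent_id: Optional[str] = None
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])

@dataclass
class LogRecord:
    # A log records what the system said was happening at a point in time.
    message: str
    severity: str = "INFO"
    timestamp: float = field(default_factory=time.time)

# One request might produce all three signals:
trace_id = uuid.uuid4().hex
root = Span(trace_id, "GET /checkout")
db = Span(trace_id, "SELECT orders", parent_id=root.span_id)
latency = Metric("checkout.latency_ms", 42.0, labels={"region": "us-east-1"})
entry = LogRecord("checkout completed")
```

The point of the sketch is the relationships: the database span shares the root span's trace ID (that is what lets a trace follow a request across services), while the metric and log stand alone with their own timestamps.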

Observing all that data first means instrumenting your system to produce it. Greg and his colleagues at Splunk are huge fans of OpenTelemetry, an open standard that can export data to any observability platform. You instrument your application once and never have to worry about it again, even if you need to change your observability platform.
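The "instrument once, switch platforms freely" idea is at heart a decoupling exercise, which can be sketched in a few lines. This is a hedged illustration in plain Python, not the real OpenTelemetry API: the application codes against a neutral instrumentation layer, and only the exporter (the vendor-specific piece) is swappable.

```python
from abc import ABC, abstractmethod

class Exporter(ABC):
    """The only vendor-specific piece: where the telemetry data goes."""
    @abstractmethod
    def export(self, signal: dict) -> None: ...

class InMemoryExporter(Exporter):
    """Stand-in backend for the sketch; a real one would ship data over the wire."""
    def __init__(self):
        self.sent = []
    def export(self, signal: dict) -> None:
        self.sent.append(signal)

class Telemetry:
    """Vendor-neutral layer the application is instrumented against."""
    def __init__(self, exporter: Exporter):
        self._exporter = exporter
    def record_metric(self, name: str, value: float) -> None:
        self._exporter.export({"type": "metric", "name": name, "value": value})

# Application code only ever touches Telemetry. Switching vendors means
# swapping the Exporter, not re-instrumenting the application.
backend = InMemoryExporter()
telemetry = Telemetry(backend)
telemetry.record_metric("orders.processed", 1.0)
```

Roughly speaking, OpenTelemetry follows this shape: applications instrument against a stable API, and exporters for particular backends are plugged in through configuration.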

Why use an approach that makes it easy for a client to switch vendors? Leffler and Splunk argue that it's better not only for customers but for Splunk and the observability industry as a whole. If you've instrumented your system with a vendor-locked solution, you may not switch at all; you may just let your observability program fall by the wayside. That helps exactly no one.

As we’ve seen, people are moving to the cloud at an ever faster pace. That’s no surprise: it offers automatic scaling for arbitrary traffic volumes, high availability, and worry-free recovery from infrastructure failures. But moving to the cloud can be expensive, and you have to do some work on your application to be able to see everything that’s going on inside it. Plenty of people just throw everything into the cloud and let the provider handle it, which is fine until they see the bill.

Observability based on an open standard makes it easier for everyone to build a more efficient and robust service in the cloud. Give the episode a listen and let us know what you think in the comments.
