Oct 2021
28m 58s

A murder mystery: who killed our user ex...

The Stack Overflow Podcast
About this episode

The infrastructure that networked applications live on is getting more and more complicated. There was a time when you could serve an application from a single machine on premises. But now, with cloud computing offering painless scaling to meet demand, your infrastructure becomes abstracted and not really something you have direct contact with. Compound that problem with an architecture spread across dozens, even hundreds of microservices, replicated across multiple data centers in an ever-changing cloud, and tracking down the source of system failures becomes something like a murder mystery. Who shot our uptime in the foot?

A good observability system helps with that. On this sponsored episode of the Stack Overflow Podcast, we talk with Greg Leffler of Splunk about the keys to instrumenting an observable system and how the OpenTelemetry standard makes observability easier, even if you aren’t using Splunk’s product. 

Observability is really an outgrowth of traditional monitoring. You expect that some service or system could break, so you keep an eye on it. But observability applies that monitoring to an entire system and gives you the ability to answer the unexpected questions that come up. It uses three principal ways of viewing system data: logs, traces, and metrics.

A metric is a number with a timestamp that tells you a particular detail about a system. Traces follow a request through a system. And logs are the causes and effects recorded from a system in motion. Splunk wants to add a fourth signal, events, which would track specific user actions and browser failures.
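To make the distinction concrete, the three signal types can be sketched as plain data records. This is an illustrative sketch only; the field names below are invented for the example and are not any vendor's or OpenTelemetry's actual schema.

```python
import time
from dataclasses import dataclass, field

# A metric: a named number paired with a timestamp.
@dataclass
class Metric:
    name: str
    value: float
    timestamp: float = field(default_factory=time.time)

# A span: one hop of a trace; spans from the same request
# share a trace_id so the request can be followed end to end.
@dataclass
class Span:
    trace_id: str
    name: str
    start: float
    end: float

# A log record: what the system said was happening at a moment in time.
@dataclass
class LogRecord:
    timestamp: float
    level: str
    message: str

cpu = Metric(name="cpu.utilization", value=0.72)
span = Span(trace_id="abc123", name="GET /checkout", start=0.0, end=0.042)
log = LogRecord(timestamp=time.time(), level="ERROR", message="payment gateway timeout")

print(cpu.name, span.trace_id, log.level)
```

An observability platform correlates all three: the metric tells you something is wrong, the trace tells you where, and the logs tell you why.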

Observing all that data means you first have to instrument your system to produce it. Greg and his colleagues at Splunk are huge fans of OpenTelemetry, an open standard for extracting telemetry data for any observability platform. You instrument your application once and never have to worry about it again, even if you need to change your observability platform.

Why use an approach that makes it easy for a client to switch vendors? Leffler and Splunk argue that it's better not only for customers, but for Splunk and the observability industry as a whole. If you've instrumented your system with a vendor-locked solution, then you may not switch; you may just let your observability program fall by the wayside. That helps exactly no one.

As we’ve seen, people are moving to the cloud at an ever faster pace. That’s no surprise; it offers automatic scaling for arbitrary traffic volumes, high availability, and worry-free infrastructure failure recovery. But moving to the cloud can be expensive, and you have to do some work with your application to be able to see everything that’s going on inside it. Plenty of people just throw everything into the cloud and let the provider handle it, which is fine until they see the bill.

Observability based on an open standard makes it easier for everyone to build a more efficient and robust service in the cloud. Give the episode a listen and let us know what you think in the comments.
