Mar 2023
1h 13m

Unlocking The Potential Of Streaming Dat...

Tobias Macey
About this episode

Summary

The promise of streaming data is that it allows you to react to new information as it happens, rather than introducing latency by batching records together. The peril is that building a robust and scalable streaming architecture is always more complicated and error-prone than you think it's going to be. After experiencing this unfortunate reality for themselves, Abhishek Chauhan and Ashish Kumar founded Grainite so that you don't have to suffer the same pain. In this episode they explain why streaming architectures are so challenging, how they have designed Grainite to be robust and scalable, and how you can start using it today to build your streaming data applications without all of the operational headache.

Announcements

  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • Businesses that adapt well to change grow 3 times faster than the industry average. As your business adapts, so should your data. RudderStack Transformations lets you customize your event data in real-time with your own JavaScript or Python code. Join The RudderStack Transformation Challenge today for a chance to win a $1,000 cash prize just by submitting a Transformation to the open-source RudderStack Transformation library. Visit dataengineeringpodcast.com/rudderstack today to learn more
  • Hey there podcast listener, are you tired of dealing with the headache that is the 'Modern Data Stack'? We feel your pain. It's supposed to make building smarter, faster, and more flexible data infrastructures a breeze. It ends up being anything but that. Setting it up, integrating it, maintaining it—it’s all kind of a nightmare. And let's not even get started on all the extra tools you have to buy to get it to do its thing. But don't worry, there is a better way. TimeXtender takes a holistic approach to data integration that focuses on agility rather than fragmentation. By bringing all the layers of the data stack together, TimeXtender helps you build data solutions up to 10 times faster and saves you 70-80% on costs. If you're fed up with the 'Modern Data Stack', give TimeXtender a try. Head over to dataengineeringpodcast.com/timextender where you can do two things: watch us build a data estate in 15 minutes and start for free today.
  • Join in with the event for the global data community, Data Council Austin. From March 28-30th 2023, they'll play host to hundreds of attendees, 100 top speakers, and dozens of startups that are advancing data science, engineering and AI. Data Council attendees are amazing founders, data scientists, lead engineers, CTOs, heads of data, investors and community organizers who are all working together to build the future of data. As a listener to the Data Engineering Podcast you can get a special discount of 20% off your ticket by using the promo code dataengpod20. Don't miss out on their only event this year! Visit: dataengineeringpodcast.com/data-council today
  • Your host is Tobias Macey and today I'm interviewing Ashish Kumar and Abhishek Chauhan about Grainite, a platform designed to give you a single place to build streaming data applications

Interview

  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what Grainite is and the story behind it?
  • What are the personas that you are focused on addressing with Grainite?
  • What are some of the most complex aspects of building streaming data applications in the absence of something like Grainite?
    • How does Grainite work to reduce that complexity?
  • What are some of the commonalities that you see in the teams/organizations that find their way to Grainite?
  • What are some of the higher-order projects that teams are able to build when they are using Grainite as a starting point vs. where they would be spending effort on a fully managed streaming architecture?
  • Can you describe how Grainite is architected?
    • How have the design and goals of the platform changed/evolved since you first started working on it?
  • What does your internal build vs. buy process look like for identifying where to spend your engineering resources?
  • What is the process for getting Grainite set up and integrated into an organization's technical environment?
    • What is your process for determining which elements of the platform to expose as end-user features and customization options vs. keeping internal to the operational aspects of the product?
  • Once Grainite is running, can you describe the day 0 workflow of building an application or data flow?
    • What are the day 2 - N capabilities that Grainite offers for ongoing maintenance/operation/evolution of those applications?
  • What are the most interesting, innovative, or unexpected ways that you have seen Grainite used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Grainite?
  • When is Grainite the wrong choice?
  • What do you have planned for the future of Grainite?

Contact Info

Parting Question

  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

  • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you've learned something or tried out a project from the show, then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
  • To help other people find the show, please leave a review on Apple Podcasts and tell your friends and co-workers.

Links

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

