Jul 23
40m 24s

Building Private GenAI stacks

Massive Studios
About this episode

Luke Marsden (@lmarsden, CEO @HelixML) talks about Private GenAI. What is it? Why do you need it? We also discuss integration into CI/CD pipelines, the layers of a Private GenAI Stack, and why most organizations are opting for RAG over fine-tuning LLMs.

SHOW: 943

SHOW TRANSCRIPT: The Cloudcast #943 Transcript

SHOW VIDEO: https://youtube.com/@TheCloudcastNET 

NEW TO CLOUD? CHECK OUT OUR OTHER PODCAST:  "CLOUDCAST BASICS" 

SPONSORS:

  • [DoIT] Visit doit.com (that’s d-o-i-t.com) to unlock intent-aware FinOps at scale with DoiT Cloud Intelligence.
  • [FCTR] Try FCTR.io (that's F-C-T-R dot io) free for 60 days. Modern security demands modern solutions. Check out Fctr's Tako AI, the first AI agent for Okta, on their website.
  • [VASION] Vasion Print eliminates the need for print servers by enabling secure, cloud-based printing from any device, anywhere. Get a custom demo to see the difference for yourself.

SHOW NOTES:

Topic 1 - Welcome to the show, Luke. Give everyone a brief intro.

Topic 2 - Let’s start with Private GenAI. What is it? Why should organizations out there consider it? Why not just use OpenAI GPTs and fine-tune them?

Topic 2a Follow up - Regulatory Compliance - take, for instance, the forces in the EU opposed to using SaaS-based services based in the United States.

Topic 3 - Let’s break down the layers in a typical Private AI stack. I’ve seen various ways to represent this, such as infrastructure layer, MLOps layer, models, data layer (typically RAG), etc. How do you break up the stack into individual components?

Topic 4 - My mind immediately jumps to similarities in the DevOps space. Abstraction layers and components like Docker and containers come to mind, integration into CI/CD pipelines, etc. I feel like MLOps is its own thing with specific tools and workflows. Does this all come together and if so how?

Topic 5 - Also, what does this mean for versioning and lifecycle management of the models and the data?

Topic 6 - We are seeing more and more data pipelines backed by multiple models, sometimes in multiple locations. How do you handle this from both a scheduling and interface standpoint? Is everything hidden behind APIs for instance?


FEEDBACK?
