logo
episode-header-image
Sep 29
1h 2m

#521: Red Teaming LLMs and GenAI with Py...

MICHAEL KENNEDY
About this episode
English is now an API. Our apps read untrusted text; they follow instructions hidden in plain sight, and sometimes they turn that text into action. If you connect a model to tools or let it read documents from the wild, you have created a brand new attack surface. In this episode, we will make that concrete. We will talk about the attacks teams are seeing in 2025, the defenses that actually work, and how to test those defenses the same way we test code. Our guides are Tori Westerhoff and Roman Lutz from Microsoft. They help lead AI red teaming and build PyRIT, a Python framework the Microsoft AI Red Team uses to pressure test real products. By the end of this hour you will know where the biggest risks live, what you can ship this quarter to reduce them, and how PyRIT can turn security from a one time audit into an everyday engineering practice.

Episode sponsors

Sentry AI Monitoring, Code TALKPYTHON
Agntcy
Talk Python Courses

Links from the show

Tori Westerhoff: linkedin.com
Roman Lutz: linkedin.com

PyRIT: aka.ms/pyrit
Microsoft AI Red Team page: learn.microsoft.com
2025 Top 10 Risk & Mitigations for LLMs and Gen AI Apps: genai.owasp.org
AI Red Teaming Agent: learn.microsoft.com
3 takeaways from red teaming 100 generative AI products: microsoft.com
MIT report: 95% of generative AI pilots at companies are failing: fortune.com

A couple of "Little Bobby AI" cartoons
Give me candy: talkpython.fm
Tell me a joke: talkpython.fm

Watch this episode on YouTube: youtube.com
Episode #521 deep-dive: talkpython.fm/521
Episode transcripts: talkpython.fm

Theme Song: Developer Rap
🥁 Served in a Flask 🎸: talkpython.fm/flasksong

---== Don't be a stranger ==---
YouTube: youtube.com/@talkpython

Bluesky: @talkpython.fm
Mastodon: @talkpython@fosstodon.org
X.com: @talkpython

Michael on Bluesky: @mkennedy.codes
Michael on Mastodon: @mkennedy@fosstodon.org
Michael on X.com: @mkennedy
Up next
Oct 6
#522: Data Sci Tips and Tricks from CodeCut.ai
Today we’re turning tiny tips into big wins. Khuyen Tran, creator of CodeCut.ai, has shipped hundreds of bite-size Python and data science snippets across four years. We dig into open-source tools you can use right now, cleaner workflows, and why notebooks and scripts don’t have ... Show More
1h 9m
Sep 23
#520: pyx - the other side of the uv coin (announcing pyx)
A couple years ago, Charlie Marsh lit a fire under Python tooling with Ruff and then uv. Today he’s back with something on the other side of that coin: pyx. Pyx isn’t a PyPI replacement. Think server, not just index. It mirrors PyPI, plays fine with pip or uv, and aims to make in ... Show More
1 h
Sep 18
#519: Data Science Cloud Lessons at Scale
Today on Talk Python: What really happens when your data work outgrows your laptop. Matthew Rocklin, creator of Dask and cofounder of Coiled, and Nat Tabris a staff software engineer at Coiled join me to unpack the messy truth of cloud-scale Python. During the episode we actually ... Show More
1h 2m
Recommended Episodes
Sep 17
GPT-5-Codex and the Year of Agentic Coding
Today on the AI Daily Brief, OpenAI launches GPT 5 Codex, a model designed for real-world software engineering with dynamic reasoning, long-task persistence, and powerful code review capabilities. We break down why this release cements 2025 as the year of agentic coding and what ... Show More
28m 45s
Jul 20
Anthropic co-founder on quitting OpenAI, AGI predictions, $100M talent wars, 20% unemployment, and the nightmare scenarios keeping him up at night | Ben Mann
Benjamin Mann is a co-founder of Anthropic, an AI startup dedicated to building aligned, safety-first AI systems. Prior to Anthropic, Ben was one of the architects of GPT-3 at OpenAI. He left OpenAI driven by the mission to ensure that AI benefits humanity. In this episode, Ben o ... Show More
1h 14m
Aug 11
64: Using AI for Building Internal AI Teams with Diane Hammond
Chris Daigle sits down with Diane Hammons, Director of Digital Engagement at WG Content, to explore how small teams can harness AI without getting lost in the noise. Diane shares the story behind WG Content’s “AI Pathfinders” group, a volunteer-based council that tackles adoption ... Show More
55m 26s
Aug 28
7 AI Use Cases Unlocked By Nano Banana
Today's AI Daily Brief covers the groundbreaking release of Google's Nano Banana image generation model, which has taken the AI community by storm over the past few weeks. Google officially revealed that Nano Banana is actually Gemini 2.5 Flash, now available as a free preview in ... Show More
25m 24s
Sep 4
Is Google Now the AI Leader?
Google’s AI comeback is turning into something bigger. Today’s AI Daily Brief covers whether Google has taken the lead in the AI race, with multimodal breakthroughs, Gemini’s surge, and a huge antitrust win around Chrome. We also dig into Anthropic’s $13B raise at a stunning $183 ... Show More
29m 23s
Sep 3
My Autumn AI Predictions
Back to school season means back to AI predictions! After a summer of skepticism around the MIT study claiming 95% of AI pilots fail, NLW dives nto what's really coming this fall and beyond. From simmering skepticism to multimodal model progress to the potential for AI M&A, NLW b ... Show More
29m 53s
May 2023
Episode 148 - AI Voodoo With Vodo Drive
SO MUCH packed into this episode! Recently, Allen participated in a hackathon sponsored by VoiceFlow, and he used the opportunity to explore ways that LLMs could be used to build on his work talking with spreadsheets in Vodo Drive (see episode 116). He and Mark explore how he did ... Show More
56m 56s
Jul 31
Can AI Trade Stocks?
Can AI really pick winning stocks? In this episode, we dive into the wild world of AI trading—where agents like ChatGPT and Perplexity aren’t just talking about the market, they’re playing it. From bold bets to biotech wins, we explore the surprising ways AI is learning to invest ... Show More
21m 43s
Sep 18
How People Actually Use ChatGPT
This episode of AI Daily Brief dives into two important reports on how people are really using AI tools like ChatGPT and Claude. OpenAI’s massive study with Harvard and NBER reveals consumer patterns across 1.5 million conversations, while Anthropic’s Economic Index tracks broade ... Show More
27m 39s