Has AI benchmarking reached its limit, and what do we have to fill this gap? Sinan Ozdemir speaks to Jon Krohn about the lack of transparency in training data and the necessity of human-led quality assurance to detect AI hallucinations, when and why to be skeptical of AI benchmarks, and the future of benchmarking agentic and multimodal models.
Additional ...
Aug 19
915: How to Jailbreak LLMs (and How to Prevent It), with Michelle Yi
Tech leader, investor, and Generationship cofounder Michelle Yi talks to Jon Krohn about finding ways to trust and secure AI systems, the methods that hackers use to jailbreak code, and what users can do to build their own trustworthy AI systems. Learn all about “red teaming” and ...
1h 9m
Aug 15
914: Data Lakes 101 (and Why They’re Key for AI Models), with Oz Katz
In this Five-Minute Friday, lakeFS cofounder and CTO Oz Katz talks to Jon Krohn about data warehouses, data lakes, and how companies can handle increasingly complex data infrastructures and formats. Hear about lakeFS’s collaboration with Legofest, lakeFS’s approach to helping ...
25m 52s
Aug 2024
Metrics Driven Development
How do you systematically measure, optimize, and improve the performance of LLM applications (like those powered by RAG or tool use)? Ragas is an open source effort that has been trying to answer this question comprehensively, and they are promoting a “Metrics Driven Development” ...
42m 12s
Nov 2024
The Future of AI: Predictions and Realities
In this episode, Jaeden Schafer discusses the current challenges and developments in the AI industry, particularly focusing on the limitations faced by major players like OpenAI and Anthropic. The conversation explores the anticipated improvements in AI models, the predictions fo ...
18m 14s
Apr 2024
Measuring The Speed of AI Through Benchmarks
David Kanter, Executive Director at MLCommons, discusses the work they’re doing with MLPerf Benchmarks, creating the world’s first industry standard approach to measuring AI speed and safety. He also shares ways they’re testing AI and LLMs for harm, to measure—and, over time, red ...
31m 45s
Jul 2023
AI Today Podcast: How AI is Transforming Insurance, Interview with Connor Atchison, Wisedocs
AI is proving transformational in every industry, including long established industries, and insurance is no exception. AI is able to optimize underwriting processes, enable more personalized insurance offerings, enhance the overall customer experience, as well as help with proce ...
30m 19s
Dec 2024
Navigating AI Safety and Security Challenges with Yonatan Zunger [The BlueHat Podcast]
While we are on our winter publishing break, please enjoy an episode of our N2K CyberWire network show, The BlueHat Podcast by Microsoft and MSRC. See you in 2025!
Yonatan Zunger, CVP of AI Safety & Security at Microsoft joins Nic Fillingham and Wendy Zenone on this week's episod ...
53m 34s
Feb 2025
Grok 3: The New AI Challenger
In this episode, Jaeden discusses the launch of Grok 3, the latest AI model from xAI, highlighting its capabilities, training methods, and performance benchmarks compared to competitors like OpenAI's ChatGPT. He shares personal experiences using Grok 3, including its reasoni ...
16m 45s