AGNTCY - Unlock agents at scale with an open Internet of Agents. Visit https://agntcy.org/ and add your support.
In this episode of Eye on AI, we sit down with Leon Song, VP of Research at Together AI, to explore how open-source models and cutting-edge infrastructure are reshaping the AI landscape.
From speculative decoding to FlashAttention and RedPajama, Leon shares how Together AI is building one of the fastest, most cost-efficient AI clouds, helping enterprises fine-tune, deploy, and scale open-source models that perform at the level of GPT-4 and beyond.
We dive into Leon’s journey from leading DeepSpeed and AI for Science at Microsoft to driving system-level innovation at Together AI.
Topics include:
The future of open-source vs. closed-source AI models
Breakthroughs in speculative decoding for faster inference
How Together AI’s cloud platform empowers enterprises with data sovereignty and model ownership
Why open-source models like DeepSeek R1 and Llama 4 are now rivaling proprietary systems
The role of GPUs vs. ASIC accelerators in scaling AI infrastructure
Whether you’re an AI researcher, an enterprise leader, or simply curious about where generative AI is heading, this conversation reveals the technology and strategy behind one of the most important players in the open-source AI movement.
Stay Updated:
Craig Smith on X: https://x.com/craigss
Eye on A.I. on X: https://x.com/EyeOn_AI