Sep 16
56m 15s

The Startup Powering The Data Behind AGI

Lukas Biewald
About this episode

In this episode of Gradient Dissent, Lukas Biewald talks with Edwin Chen, CEO and founder of Surge AI, the billion-dollar company quietly powering the next generation of frontier LLMs. They discuss Surge's origin story, why traditional data labeling is broken, and how the company's research-focused approach is reshaping how models are trained.

You’ll hear why inter-annotator agreement fails in high-complexity tasks like poetry and math, why synthetic data is often overrated, and how Surge builds rich RL environments to stress-test agentic reasoning. They also go deep on what kinds of data will be critical to future progress in AI—from scientific discovery to multimodal reasoning and personalized alignment.


It’s a rare, behind-the-scenes look into the world of high-quality data generation at scale—straight from the team most frontier labs trust to get it right.


Timestamps:

00:00 – Intro: Who is Edwin Chen?

03:40 – The problem with early data labeling systems

06:20 – Search ranking, clickbait, and product principles

10:05 – Why Surge focused on high-skill, high-quality labeling

13:50 – From Craigslist workers to a billion-dollar business

16:40 – Scaling without funding and avoiding Silicon Valley status games

21:15 – Why most human data platforms lack real tech

25:05 – Detecting cheaters, liars, and low-quality labelers

28:30 – Why inter-annotator agreement is a flawed metric

32:15 – What makes a great poem? Not checkboxes

36:40 – Measuring subjective quality rigorously

40:00 – What types of data are becoming more important

44:15 – Scientific collaboration and frontier research data

47:00 – Multimodal data, Argentinian coding, and hyper-specificity

50:10 – What's wrong with LMSYS and benchmark hacking

53:20 – Personalization and taste in model behavior

56:00 – Synthetic data vs. high-quality human data


Follow Weights & Biases:

https://twitter.com/weights_biases

https://www.linkedin.com/company/wandb
