May 2024
1h 8m

Suhail Doshi: The Future of Computer Vis...

Daniel Bashir
About this episode

Episode 123

I spoke with Suhail Doshi about:

* Why benchmarks aren’t prepared for tomorrow’s AI models

* How he thinks about artists in a world with advanced AI tools

* Building a unified computer vision model that can generate, edit, and understand pixels

Suhail is a software engineer and entrepreneur known for founding Mixpanel, Mighty Computing, and Playground AI (they’re hiring!).

Reach me at editor@thegradient.pub for feedback, ideas, and guest suggestions.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Outline:

* (00:00) Intro

* (00:54) Ad read — MLOps conference

* (01:30) Suhail is *not* in pivot hell but he *is* all-in on 50% AI-generated music

* (03:45) AI and music, similarities to Playground

* (07:50) Skill vs. creative capacity in art

* (12:43) What we look for in music and art

* (15:30) Enabling creative expression

* (18:22) Building a unified computer vision model, underinvestment in computer vision

* (23:14) Enhancing the aesthetic quality of images: color and contrast, benchmarks vs user desires

* (29:05) “Benchmarks are not prepared for how powerful these models will become”

* (31:56) Personalized models and personalized benchmarks

* (36:39) Engaging users and benchmark development

* (39:27) What a foundation model for graphics requires

* (45:33) Text-to-image is insufficient

* (46:38) DALL-E 2 and Imagen comparisons, FID

* (49:40) Compositionality

* (50:37) Why Playground focuses on images vs. 3D, video, etc.

* (54:11) Open source and Playground’s strategy

* (57:18) When to stop open-sourcing?

* (1:03:38) Suhail’s thoughts on AGI discourse

* (1:07:56) Outro

Links:

* Playground homepage

* Suhail on Twitter



Get full access to The Gradient at thegradientpub.substack.com/subscribe