Jun 2024
1h 10m

Gen AI at the Edge: Qualcomm AI Research...

Sam Charrington
About this episode

Today we’re joined by Fatih Porikli, senior director of technology at Qualcomm AI Research. In our conversation, we cover several of the Qualcomm team’s 16 accepted main track and workshop papers at this year’s CVPR conference. The papers span a variety of generative AI and traditional computer vision topics, with an emphasis on increased training and inference efficiency for mobile and edge deployment. We explore efficient diffusion models for text-to-image generation, grounded reasoning in videos using language models, real-time on-device 360° image generation for video portrait relighting, a unique video-language model for situated interactions like fitness coaching, a visual reasoning model and benchmark for interpreting complex mathematical plots, and more! We also touch on several of the demos the team will be presenting at the conference, including multi-modal vision-language models (LLaVA) and parameter-efficient fine-tuning (LoRA) on mobile phones.


The complete show notes for this episode can be found at https://twimlai.com/go/688.
