Aug 2023
27m 8s

Cuttlefish Model Tuning

Kyle Polich
About this episode

Hongyi Wang, a Senior Researcher in the Machine Learning Department at Carnegie Mellon University, joins us. His research is at the intersection of systems and machine learning. He discussed his research paper, Cuttlefish: Low-Rank Model Training without All the Tuning, on today’s show.

Hongyi started by sharing his thoughts on whether developers need to learn how to fine-tune models. He then spoke about the need to optimize the training of ML models, especially as these models grow bigger. He noted that large data centers have the hardware to train such models, but the broader community does not. He then spoke about the Low-Rank Adaptation (LoRA) technique and where it is used.
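As background for listeners, here is a minimal sketch of the LoRA idea as a hypothetical PyTorch module (an illustration only, not code from the episode or from any particular library): the pretrained weight stays frozen, and only a small low-rank update is trained.

```python
# Hypothetical illustration of the LoRA idea: keep the pretrained weight W frozen
# and learn a low-rank update B @ A, so the effective weight becomes
# W + (alpha / r) * B @ A with rank r much smaller than the layer dimensions.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, rank=8, alpha=16.0):
        super().__init__()
        # Frozen "pretrained" weight (random here, standing in for a real model's layer).
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        # Trainable low-rank factors: only rank * (in_features + out_features) parameters.
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x):
        # Base projection plus the scaled low-rank correction.
        return x @ self.weight.T + self.scaling * ((x @ self.lora_A.T) @ self.lora_B.T)

layer = LoRALinear(1024, 1024, rank=8)
x = torch.randn(4, 1024)
print(layer(x).shape)  # torch.Size([4, 1024])
```

Because lora_B starts at zero, the adapted layer initially behaves exactly like the frozen base layer, and fine-tuning only updates the two small factor matrices.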

Hongyi discussed the Cuttlefish model and how it improves on LoRA. He shared the use cases of Cuttlefish and who should use it. Rounding out the conversation, he gave his advice on how people can get into the machine learning field. He also shared his future research ideas.
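Cuttlefish targets low-rank training from scratch rather than fine-tuning. The sketch below illustrates the general family of low-rank factorized training it belongs to; it is a simplified, hypothetical example, not the paper's algorithm, whose contribution is automating choices such as per-layer ranks and when to switch from full-rank to factorized training rather than hard-coding them as done here.

```python
# Generic sketch of low-rank factorized training: replace a dense weight W (out x in)
# with two thin factors U (out x r) and V (r x in) and train the factors directly.
# The rank is hard-coded here; choosing it well is exactly the tuning burden that
# Cuttlefish aims to remove. Illustration only, not the paper's implementation.
import torch
import torch.nn as nn

class FactorizedLinear(nn.Module):
    def __init__(self, in_features, out_features, rank):
        super().__init__()
        self.U = nn.Parameter(torch.randn(out_features, rank) * 0.02)
        self.V = nn.Parameter(torch.randn(rank, in_features) * 0.02)

    def forward(self, x):
        # Equivalent to x @ (U @ V).T, but the full out x in matrix is never materialized.
        return (x @ self.V.T) @ self.U.T

# Parameter count drops from in_features * out_features to rank * (in_features + out_features).
print(4096 * 4096, 64 * (4096 + 4096))  # 16777216 vs. 524288
```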

Up next
Sep 22
Interpretable Real Estate Recommendations
In this episode of Data Skeptic's Recommender Systems series, host Kyle Polich interviews Dr. Kunal Mukherjee, a postdoctoral research associate at Virginia Tech, about the paper "Z-REx: Human-Interpretable GNN Explanations for Real Estate Recommendations." The discussion explores ...
32m 57s
Sep 8
Why Am I Seeing This?
In this episode of Data Skeptic, we explore the challenges of studying social media recommender systems when exposure data isn't accessible. Our guests Sabrina Guidotti, Gregor Donabauer, and Dimitri Ognibene introduce their innovative "recommender neutral user model" for inferri ...
49m 36s
Aug 30
Eco-aware GNN Recommenders
In this episode of Data Skeptic, we dive into eco-friendly AI with Antonio Purificato, a PhD student from Sapienza University of Rome. Antonio discusses his research on "EcoAware Graph Neural Networks for Sustainable Recommendations" and explores how we can measure and reduce the ...
44m 42s
Recommended Episodes
Feb 2025
π0: A Foundation Model for Robotics with Sergey Levine - #719
Today, we're joined by Sergey Levine, associate professor at UC Berkeley and co-founder of Physical Intelligence, to discuss π0 (pi-zero), a general-purpose robotic foundation model. We dig into the model architecture, which pairs a vision language model (VLM) with a diffusion-ba ...
52m 30s
May 2023
TinyML: Bringing machine learning to the edge
When we think about machine learning today we often think in terms of immense scale — large language models that require huge amounts of computational power, for example. But one of the most interesting innovations in machine learning right now is actually happening on a really s ...
45m 45s
Jul 2024
Bridging the Sim2real Gap in Robotics with Marius Memmel - #695
Today, we're joined by Marius Memmel, a PhD student at the University of Washington, to discuss his research on sim-to-real transfer approaches for developing autonomous robotic agents in unstructured environments. Our conversation focuses on his recent ASID and URDFormer papers. ...
57m 21s
Aug 2024
AI that connects the digital and physical worlds | Anima Anandkumar
"While language models may help generate new ideas, they cannot attack the hard part of science, which is simulating the necessary physics," says AI professor Anima Anandkumar. She explains how her team developed neural operators — AI trained on the finest details of the real worl ...
12m 14s
Apr 2025
Teaching LLMs to Self-Reflect with Reinforcement Learning with Maohao Shen - #726
Today, we're joined by Maohao Shen, PhD student at MIT to discuss his paper, “Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search.” We dig into how Satori leverages reinforcement learning to improve language model reasoning ...
51m 45s
Apr 2025
Andriy Burkov - The TRUTH About Large Language Models and Agentic AI (with Andriy Burkov, Author "The Hundred-Page Language Models Book")
Andriy Burkov is a renowned machine learning expert and leader. He's also the author of (so far) three books on machine learning, including the recently-released "The Hundred-Page Language Models Book", which takes curious people from the very basics of language models all the wa ...
1h 24m
Apr 2024
777: Generative AI in Practice, with Bernard Marr
Generative AI is reshaping our world, and Bernard Marr, world-renowned futurist and best-selling author, joins Jon Krohn to guide us through this transformation. In this episode, Bernard shares his insights on how AI is transforming industries, revolutionizing daily life, and add ...
1h 8m
Sep 2024
Smart Talks with IBM: The power of Granite in business
As the scale of artificial intelligence continues to evolve, open technology like many of IBM’s Granite models are helping enhance transparency in AI and improve efficiency across businesses. In this episode of Smart Talks with IBM, Jacob Goldstein sat down with Maryam Ashoori, t ...
32m 16s
Jul 2024
#229 Inside Meta's Biggest and Best Open-Source AI Model Yet with Thomas Scialom, Co-Creator of Llama3
Meta has been at the absolute edge of the open-source AI ecosystem, and with the recent release of Llama 3.1, they have officially created the largest open-source model to date. So, what's the secret behind the performance gains of Llama 3.1? What will the future of open-source A ...
39m 23s