Apr 8
1h 24m

Andriy Burkov - The TRUTH About Large La...

One Knight in Product
About this episode

Andriy Burkov is a renowned machine learning expert and leader. He's also the author of (so far) three books on machine learning, including the recently-released "The Hundred-Page Language Models Book", which takes curious people from the very basics of language models all the way up to building their own LLM. Andriy is also a formidable online presence and is never afraid to call BS on over-the-top claims about AI capabilities via his punchy social media posts.

Episode highlights

1. Large Language Models are neither magic nor conscious

LLMs boil down to relatively simple mathematics at an unfathomably large scale. Humans are terrible at visualising big numbers and cannot comprehend the size of the datasets or the number of GPUs used to create these models. You can train the same LLM on a handful of records and get garbage results, or throw millions of dollars at it and get good results, but the fundamentals are identical, and there's no consciousness hiding between the equations. We see good-looking output and we think it's talking to us. It isn't.
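
The "same fundamentals, different scale" point can be sketched with a toy bigram model: the count-then-normalise idea behind next-token prediction, trained on a dozen words instead of trillions. (This is an illustrative sketch, not the actual transformer mathematics; the corpus is made up.)

```python
from collections import defaultdict, Counter

# A toy next-word predictor trained on a handful of records.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word(word):
    # Pick the most frequently seen follower from the training data.
    followers = counts[word]
    return followers.most_common(1)[0][0] if followers else "."

# Generate a few words: with this little data, the output loops almost
# immediately -- garbage in, garbage out, exactly as described above.
word, out = "the", ["the"]
for _ in range(6):
    word = next_word(word)
    out.append(word)
print(" ".join(out))
```

The machinery is the same kind of statistical pattern-matching either way; only the scale of data and compute separates this loop from a useful model.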

2. As soon as we saw it was possible to do mathematics on words, LLMs were inevitable

There were language models before LLMs, but the invention of the transformer architecture truly accelerated everything. That said, the fundamentals trace further back to "simpler" algorithms such as word2vec, which proved that linguistic information could be encoded numerically. Once most of a language's meaning could be represented as embeddings, people could run equations on words, and after that it was just a matter of time before the approach was scaled up.
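
"Running equations on words" can be illustrated with the classic word2vec-style analogy. The vectors below are tiny, hand-made stand-ins (real embeddings are learned from data and have hundreds of dimensions), but the arithmetic is the same idea:

```python
import numpy as np

# Hand-made 3-dimensional "embeddings", purely for illustration.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
}

def cosine(a, b):
    # Cosine similarity: how closely two vectors point the same way.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The classic analogy: king - man + woman should land near queen.
target = emb["king"] - emb["man"] + emb["woman"]
best = max(emb, key=lambda w: cosine(emb[w], target))
print(best)  # → queen
```

Once words become vectors, analogies become vector addition and similarity becomes a dot product, and the rest of the LLM stack builds on exactly that move.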

3. LLMs look intelligent because people generally ask about things they already know about

The best way to be disappointed by an LLM's results is to ask detailed questions about something you know deeply. It's quite likely to give good results to start with, because most people's knowledge is so unoriginal that, somewhere in the LLM's training data, there are documents about the thing you asked. But the quality degrades as you probe deeper, and the model will confidently keep writing even when it doesn't know the answer. These are not easily solvable problems; they are fundamental to how an LLM is designed.
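
Why the confident writing never stops is visible in the decoding step itself. An LLM's final layer turns raw scores (logits) into a probability distribution with softmax, and a token is then emitted no matter how flat that distribution is; there is no built-in "I don't know" path. A minimal sketch with made-up logits and a tiny made-up vocabulary:

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return e / e.sum()

vocab = ["Paris", "London", "Rome", "Berlin"]

# One set of logits where the model "knows", one where it doesn't.
confident = softmax(np.array([9.0, 1.0, 1.0, 1.0]))
uncertain = softmax(np.array([1.1, 1.0, 0.9, 1.0]))

# Both are valid distributions that sum to 1, and greedy decoding
# picks a token either way -- the reader never sees the uncertainty.
print(vocab[int(np.argmax(confident))], round(float(confident.max()), 3))
print(vocab[int(np.argmax(uncertain))], round(float(uncertain.max()), 3))
```

In both cases a fluent next token comes out; only the hidden probabilities differ, which is why hallucination is a design property rather than a bug to be patched.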

4. Agentic AI relies on unreliable actors with no true sense of agency

The concept of agents is not new, and people have been talking about them for years. The key aspect of AI agents is that they need self-motivation and goals of their own, rather than being told to have goals and then simulating the desire to achieve them. That's not to say that some agents are not useful in their own right, but the goal of fully autonomous, agentic systems is a long way off, and may not even be solvable.

5. LLMs represent the most incredible technical advance since the personal computer, but people should quit it with their most egregious claims

LLMs are an incredible tool and can open up whole new worlds for people who are able to get the best out of them. There are limits to their utility, and some of their shortcomings are likely unsolvable, but we should not minimise their impact. However, there are unethical people out there making completely unsubstantiated claims based on zero evidence and a fundamental misunderstanding of how these models work. They are scaring the public and encouraging terrible decision-making by the gullible. We need to see through the hype.

Buy "The Hundred-Page Language Models Book"

"Large language models (LLMs) have fundamentally transformed how machines process and generate information. They are reshaping white-collar jobs at a pace comparable only to the revolutionary impact of personal computers. Understanding the mathematical foundations and inner workings of language models has become crucial for maintaining relevance and competitiveness in an increasingly automated workforce. This book guides you through the evolution of language models, starting from machine learning fundamentals. Rather than presenting transformers right away, which can feel overwhelming, we build understanding of language models step by step—from simple count-based methods through recurrent neural networks to modern architectures. Each concept is grounded in clear mathematical foundations and illustrated with working Python code."

Check it out on the book's website: https://thelmbook.com/.

You can also check out Machine Learning Engineering: https://www.mlebook.com and The Hundred-Page Machine Learning Book: https://www.themlbook.com/.

Follow Andriy

You can catch up with Andriy here:
