Feb 2015
13m 15s

Labels and Where To Find Them

Ben Jaffe and Katie Malone
About this episode
Supervised classification is built on the backs of labeled datasets, but a good set of labels can be hard to find. Great data is everywhere, but the corresponding labels can sometimes be really tricky. Take a few examples we've already covered, like lie detection with an MRI machine (have to take pictures of someone's brain while they try to lie, not a tri ...
Up next
Jul 2020
So long, and thanks for all the fish
All good things must come to an end, including this podcast. This is the last episode we plan to release, and it doesn’t cover data science: it’s mostly reminiscing, thanking our wonderful audience (that’s you!), and marveling at how this thing that started out as a side project g ...
35m 44s
Jul 2020
A Reality Check on AI-Driven Medical Assistants
The data science and artificial intelligence community has made amazing strides in the past few years to algorithmically automate portions of the healthcare process. This episode looks at two computer vision algorithms, one that diagnoses diabetic retinopathy and another that cla ...
14 m
Jul 2020
A Data Science Take on Open Policing Data
A few weeks ago, we put out a call for data scientists interested in issues of race and racism, or people studying how those topics can be studied with data science methods, to get in touch and come talk to our audience about their work. This week we’re excited to bring on Tod ...
23m 44s
Recommended Episodes
May 2022
27: Are humans too complicated for labels?
Labels are everywhere. We want to belong to certain communities and groups. We want to belong, period. We also want to feel understood, both by others and by ourselves. But what are some downsides of seeking out labels? 
21m 45s
Jun 2022
Using AI to Supercharge Data-Driven Applications with Zilliz
Theo is in the interviewer’s chair for this episode as Frank Liu from Zilliz joins the show to talk about how AI and machine learning are making it possible for developers to understand and extract more value from unstructured data such as text, audio, images, video, and more. Tr ...
20 m
Nov 2021
Data Quality Starts At The Source
The most important gauge of success for a data platform is the level of trust in the accuracy of the information that it provides. In order to build and maintain that trust, it is necessary to invest in defining, monitori ...
58m 55s
Aug 2022
Collecting And Retaining Contextual Metadata For Powerful And Effective Data Discovery
Data is useless if it isn’t being used, and you can’t use it if you don’t know where it is. Data catalogs were the first solution to this problem, but they are only helpful if you know what you are look ...
53m 24s
May 2023
#139 How Data Scientists Can Thrive in the FMCG Industry
A lot of the time when we walk into a supermarket, we don't necessarily think about the impact data science had in getting these products on shelves. However, as you’ll learn in today's episode, it's safe to say there's a myriad of applications for data science in the FMCG indus ...
42m 10s
Mar 2024
Maximizing Efficiency with Color Label Printers: Revolutionizing Your Labeling Process
tail spinning
2m 51s
Dec 2017
[MINI] Parallel Algorithms
When computers became commodity hardware and storage became incredibly cheap, we entered the era of so-called "big" data. Most definitions of big data will include something about not being able to process all the data on a single machine. Distributed computing is required for such ...
20m 37s
Aug 2021
Ai startup 9
deepset works to get more meaningful search results. It uses transfer learning, language models, and question answering to drive search results and make sense of text data. deepset is an open source company. It uses natural language processing to answer questions using BERT. ...
8m 3s
Sep 2023
2503: LogicMonitor - Data Observability with Taggart Matthiesen
Today, I dive into the complex and often nebulous world of data observability with a leading expert in the field, Taggart Matthiesen, the Chief Product Officer at LogicMonitor. With an impressive career trajectory that includes pivotal roles at Lyft, Twitter, and Salesforce, T ...
22m 7s