Intro

If data is fuel, then we need to measure its value

Data value in the context of ML

Ingredients of Data Value in ML

Leave One Out Method

Desirable properties

Data Shapley Value

Applications of Data Shapley

UK Biobank Lung Cancer prediction

Removing low value data improves prediction

Adding high value data improves prediction

Negative Shapley identifies mislabeled data

Domain adaptation: face recognition

Dermatology classification

Clinical notes NLP

Efficiently approximating data Shapley

New frontiers of data valuation

Discussion

Description:

In this 44-minute lecture, James Zou of Stanford University discusses equitable data valuation in machine learning. The talk explains why the value of data needs to be measured and what data value means in an ML context, then introduces the Leave One Out method and the Data Shapley value, along with the desirable properties of a data valuation. Case studies cover UK Biobank lung cancer prediction, domain adaptation for face recognition, dermatology classification, and clinical notes NLP. The lecture shows that removing low-value data and adding high-value data both improve prediction, and that negative Shapley values can identify mislabeled data. It closes with efficient methods for approximating Data Shapley and new frontiers in data valuation.

What is Your Data Worth? Equitable Data Valuation in Machine Learning

Simons Institute

#Data Science #Computer Science #Machine Learning #Artificial Intelligence #Ethics in AI #Fairness in AI #Domain Adaptation
