Intro

Outline

The ML model development process

Model Evaluation

Motivation

Common approach: importance weighting

Motivating example

Mandoline: Slice-based reweighting framework

The theory behind using slices

More formally...

Density Ratio Estimation

Experiments: tasks

Experiments: compare to reweighting on x

Summary

Taking a step back - how do we get slices? What are sli

Measuring model performance

Hidden Stratification: Approach

ML model development process, revisited

Another angle - how else can we evaluate?

"Closing the loop" - how do we update?

Description:

This Stanford University lecture presents Mandoline, a framework for evaluating machine learning models under distribution shift. Mandoline uses user-defined "slicing functions" to guide importance weighting, which improves performance estimates on a target distribution. The lecture covers the theory behind the framework, including density ratio estimation and how its error scales, and reports experiments on NLP and vision tasks in which the method outperforms standard baselines. It also connects the approach to interactive ML systems and discusses broader questions of model evaluation, hidden stratification, and iterative model development when deploying ML models in real-world settings.
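As a rough illustration of the idea in the description, the sketch below reweights labeled source examples using only their slice membership, then estimates accuracy on the target distribution. It is a minimal sketch under stated assumptions, not the lecture's implementation: the function names `slice_weights` and `reweighted_accuracy` are hypothetical, the slices are assumed to be binary, and a plain count-based frequency ratio stands in for the density ratio estimation step the lecture discusses.

```python
from collections import Counter

import numpy as np


def slice_weights(g_source, g_target):
    """Importance weights for source examples, computed from slice vectors.

    Assumption: each row is a binary vector of slicing-function outputs,
    and every slice pattern seen in the target also appears in the source.
    The weight is an empirical ratio p_target(g) / p_source(g).
    """
    src_keys = [tuple(row) for row in np.asarray(g_source).astype(int).tolist()]
    tgt_keys = [tuple(row) for row in np.asarray(g_target).astype(int).tolist()]
    src_freq, tgt_freq = Counter(src_keys), Counter(tgt_keys)
    n_s, n_t = len(src_keys), len(tgt_keys)
    return np.array(
        [(tgt_freq[k] / n_t) / (src_freq[k] / n_s) for k in src_keys]
    )


def reweighted_accuracy(correct, weights):
    """Self-normalized importance-weighted accuracy on the source set."""
    correct = np.asarray(correct, dtype=float)
    return float(np.sum(weights * correct) / np.sum(weights))


# Toy data (made up for illustration): one slicing function.
# The model is right on slice 1 and wrong on slice 0 in the source set.
g_src = np.array([[1], [1], [0], [0]])
correct = np.array([1, 1, 0, 0])
# Unlabeled target set: slice 1 is more common than in the source.
g_tgt = np.array([[1], [1], [1], [0]])

weights = slice_weights(g_src, g_tgt)
estimate = reweighted_accuracy(correct, weights)
print(estimate)
```

In this toy case the unweighted source accuracy is 0.5, while the reweighted estimate moves toward the accuracy on the slice that is more common in the target. Weighting on a handful of slices rather than on the raw inputs is the design choice the description attributes to Mandoline; the count-based ratio used here only works when the number of distinct slice patterns is small.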

Mandoline - Model Evaluation under Distribution Shift

Stanford University

#Computer Science #Machine Learning #Health & Medicine #Health Care #Artificial Intelligence #Natural Language Processing (NLP) #Model Evaluation
