Taking a step back - how do we get slices? What are sli
16
Measuring model performance
17
Hidden Stratification: Approach
18
ML model development process, revisited
19
Another angle - how else can we evaluate?
20
"Closing the loop" - how do we update?
Description:
Explore a comprehensive framework for evaluating machine learning models under distribution shift in this Stanford University lecture. Dive into the Mandoline approach, which leverages user-defined "slicing functions" to guide importance weighting and improve performance estimation on target distributions. Learn how this method outperforms standard baselines in NLP and vision tasks, and understand its connection to interactive ML systems. Gain insights into the theoretical foundations of the framework, including density ratio estimation and its error scaling. Discover the broader implications for model evaluation, hidden stratification, and iterative model development processes in the context of deploying ML models in real-world scenarios.
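To make the idea of slice-guided importance weighting concrete, here is a minimal sketch in Python. It is not the Mandoline estimator from the lecture. It assumes binary slicing functions and swaps in a simple count-based density ratio: each labeled source example is weighted by how much more often its slice pattern appears in the unlabeled target data than in the source data. The names `is_long`, `slice_vector`, `importance_weights`, and `weighted_accuracy`, along with the toy data, are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical slicing function: flags inputs longer than 5 characters.
def is_long(text):
    return len(text) > 5

def slice_vector(x, slicing_functions):
    """Represent an example by the binary outputs of every slicing function."""
    return tuple(int(f(x)) for f in slicing_functions)

def importance_weights(source, target, slicing_functions):
    """Count-based density ratio over slice patterns (a simplification,
    assumed here for illustration; not the estimator used by Mandoline)."""
    if not source or not target:
        raise ValueError("source and target must be non-empty")
    src_vecs = [slice_vector(x, slicing_functions) for x in source]
    src_counts = Counter(src_vecs)
    tgt_counts = Counter(slice_vector(x, slicing_functions) for x in target)
    n_src, n_tgt = len(source), len(target)
    # weight = P_target(slice pattern) / P_source(slice pattern)
    return [(tgt_counts[v] / n_tgt) / (src_counts[v] / n_src) for v in src_vecs]

def weighted_accuracy(correct, weights):
    """Self-normalized importance-weighted accuracy on the source set."""
    total = sum(weights)
    if total == 0:
        raise ValueError("no source example overlaps the target slices")
    return sum(w * c for w, c in zip(weights, correct)) / total

# Toy data: the labeled source set is mostly short inputs,
# the unlabeled target set is mostly long inputs.
source = ["ok"] * 8 + ["a much longer input"] * 2
target = ["ok"] * 2 + ["a much longer input"] * 8
# Assume the model is right on every short input and wrong on every long one.
correct = [1] * 8 + [0] * 2

weights = importance_weights(source, target, [is_long])
source_accuracy = sum(correct) / len(correct)
estimated_target_accuracy = weighted_accuracy(correct, weights)
print(source_accuracy, estimated_target_accuracy)
```

In this toy setup the unweighted source accuracy is 0.8, while the reweighted estimate comes out near 0.2, because long inputs, where the assumed model fails, dominate the target data. Counting slice patterns directly only works when there are few slicing functions and every target pattern also appears in the source set.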
Mandoline - Model Evaluation under Distribution Shift