1. Intro
2. Supervised ML
3. Interpolation and Overfitting
4. Modern ML
5. Fit without Fear
6. Overfitting perspective
7. Kernel machines
8. Interpolation in kernels
9. Interpolated classifiers work
10. What is going on?
11. Performance of kernels
12. Kernel methods for big data
13. The limits of smooth kernels
14. EigenPro: practical implementation
15. Comparison with state-of-the-art
16. Improving speech intelligibility
17. Stochastic Gradient Descent
18. The Power of Interpolation
19. Optimality of mini-batch size 1
20. Minibatch size?
21. Real data example
22. Learning kernels for parallel computation?
23. Theory vs practice
24. Model complexity of interpolation?
25. How to test model complexity?
26. Testing model complexity for kernels
27. Levels of noise
28. Theoretical analyses fall short
29. Simplicial interpolation A-fit
30. Nearly Bayes optimal
31. Parting Thoughts I
Description:
Explore the intriguing world of over-parametrization in modern supervised machine learning in this 55-minute lecture by Mikhail Belkin of Ohio State University. Delve into the paradox of deep networks with millions of parameters that interpolate the training data yet perform excellently on test sets. Discover how classical kernel methods exhibit similar properties to deep learning and offer competitive alternatives when scaled to big data. Examine the effectiveness of stochastic gradient descent in driving training error to zero in the interpolated regime. Gain insight into the challenges of understanding deep learning and the importance of developing a fundamental grasp of "shallow" kernel classifiers in over-fitted settings. Finally, consider the perspective that much of modern learning's success can be understood through the lens of over-parametrization and interpolation, and the crucial question of why classifiers in this "modern" interpolated setting generalize so well to unseen data.
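To make the description's central claim concrete, here is a minimal sketch (not taken from the lecture) of the interpolation phenomenon: a Gaussian-kernel classifier pushed to fit noisy training labels almost exactly can still generalize reasonably well. It assumes scikit-learn is available; the synthetic dataset, the SVC model, and hyperparameters such as C=1e6 are illustrative choices, not from the talk.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic binary classification problem with 10% label noise (flip_y=0.1).
X, y = make_classification(n_samples=2000, n_features=20, flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# An RBF-kernel SVM with a very large C behaves like a (near-)interpolating
# classifier: it drives the training error to roughly zero despite the noisy labels.
clf = SVC(kernel="rbf", C=1e6, gamma="scale")
clf.fit(X_train, y_train)

print("train accuracy:", clf.score(X_train, y_train))  # typically close to 1.0
print("test accuracy:", clf.score(X_test, y_test))     # typically well above chance

Even though the model fits the flipped labels on the training set, its test accuracy remains far better than classical bias-variance intuition might suggest; explaining why is the question the lecture addresses.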

Fit Without Fear - An Over-Fitting Perspective on Modern Deep and Shallow Learning

MITCBMM