Interpolation does not overfit even for very noisy data
Why bounds fail
Interpolation is best practice for deep learning
Historical recognition
The key lesson
Generalization theory for interpolation?
A way forward?
Interpolated k-NN schemes
Interpolation and adversarial examples
Double descent risk curve
More parameters are better: an example
Random Fourier networks
What is the mechanism?
Double descent in random feature settings
Smoothness by averaging
Framework for modern ML
The landscape of generalization
Optimization: classical
Modern optimization
From classical statistics to modern ML
The nature of inductive bias
Memorization and interpolation
Interpolation in deep auto-encoders
Neural networks as models for associative memory
Why are attractors surprising?
Memorizing sequences
Description:
Explore the paradigm shift in machine learning theory presented by Professor Mikhail Belkin in this thought-provoking lecture. Delve into the apparent contradiction between classical statistical wisdom and modern deep learning practice, where over-parameterized models that fit the training data almost perfectly nevertheless show excellent test performance. Examine the challenges this poses to traditional Empirical Risk Minimization, and discover the emerging "double descent" risk curve that unifies classical and modern models. Investigate the nature of inductive bias in deep learning, particularly in auto-encoders and their potential implementation of associative memory. Gain insights into the evolving landscape of generalization theory, optimization techniques, and the role of memorization in neural networks.
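The "double descent" curve mentioned above can be reproduced in a small simulation. The following is a minimal sketch, not an experiment from the lecture: it assumes minimum-norm least squares on random Fourier features for a noisy 1-D regression task, and the target function, noise level, and feature counts are illustrative choices.

```python
# Sketch of double descent with random Fourier features (illustrative setup).
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    return np.sin(2 * np.pi * x)            # ground-truth function

n_train, noise = 20, 0.3
x_train = rng.uniform(-1, 1, n_train)
y_train = target(x_train) + noise * rng.standard_normal(n_train)
x_test = rng.uniform(-1, 1, 200)
y_test = target(x_test)

def rff(x, n_features, seed=1):
    """Random Fourier features cos(w*x + b) with fixed random w, b."""
    r = np.random.default_rng(seed)
    w = r.normal(0, 10, n_features)
    b = r.uniform(0, 2 * np.pi, n_features)
    return np.cos(np.outer(x, w) + b)

for n_features in [2, 5, 10, 15, 20, 25, 40, 100, 500, 2000]:
    Phi_tr = rff(x_train, n_features)
    Phi_te = rff(x_test, n_features)
    # lstsq returns the minimum-norm solution once n_features > n_train,
    # i.e. the smoothest interpolant in the over-parameterized regime.
    coef, *_ = np.linalg.lstsq(Phi_tr, y_train, rcond=None)
    train_mse = np.mean((Phi_tr @ coef - y_train) ** 2)
    test_mse = np.mean((Phi_te @ coef - y_test) ** 2)
    print(f"{n_features:5d} features | train MSE {train_mse:.3f} | test MSE {test_mse:.3f}")

# Test error typically spikes near the interpolation threshold
# (n_features ~ n_train) and falls again as the model grows: double descent.
```

Run as a script, this prints training and test error for each feature count; the qualitative shape of the test-error column is the double descent curve discussed in the lecture.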
Beyond Empirical Risk Minimization - The Lessons of Deep Learning