Explore the foundations of deep learning in this lecture on stochastic gradient descent, overparametrization, and generalization. Delve into fundamental questions and challenges in the field, examining the optimization landscape and how gradient descent finds global minima. Investigate prediction dynamics, local geometry, and the interplay between local and global geometry. Analyze generalization error and the impact of overparametrization on model performance. Gain insight into margin theory, how the logistic loss leads to the max margin, and how overparametrization improves the margin. Compare optimization with explicit regularizers to Neural Tangent Kernel (NTK) approaches. Examine the distinctive properties of gradient descent and explore steepest descent methods through examples. Move beyond linear models to deep networks, discussing implicit regularization and the importance of architecture. Consider the effects of changing depth in linear and convolutional networks, and conclude with thought-provoking ideas on the future of deep learning.
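As a rough illustration of the "max margin via logistic loss" theme mentioned above (not material from the lecture itself), the sketch below runs plain gradient descent on the unregularized logistic loss over a synthetic separable dataset; the data, step size, and iteration counts are made-up assumptions. The normalized margin of the iterates tends to grow over training even though no regularizer is present.

```python
# Illustrative sketch only: gradient descent on the logistic loss over linearly
# separable data tends to push the *direction* of w toward a large-margin
# separator, with no explicit regularization. Toy data and step size are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Separable toy data: two Gaussian blobs with labels +1 / -1.
n = 100
X = np.vstack([rng.normal(+2.0, 0.5, size=(n // 2, 2)),
               rng.normal(-2.0, 0.5, size=(n // 2, 2))])
y = np.concatenate([np.ones(n // 2), -np.ones(n // 2)])

w = np.zeros(2)
lr = 0.1

def normalized_margin(w, X, y):
    """Smallest value of y_i <w, x_i> / ||w|| over the training set."""
    norm = np.linalg.norm(w)
    return np.min(y * (X @ w)) / norm if norm > 0 else 0.0

for step in range(1, 20001):
    # Gradient of the average logistic loss (1/n) * sum_i log(1 + exp(-y_i <w, x_i>)).
    p = 1.0 / (1.0 + np.exp(y * (X @ w)))          # sigma(-y_i <w, x_i>)
    grad = -(X * (y * p)[:, None]).mean(axis=0)
    w -= lr * grad
    if step % 5000 == 0:
        print(f"step {step:6d}  ||w|| = {np.linalg.norm(w):7.2f}  "
              f"normalized margin = {normalized_margin(w, X, y):.4f}")
```

Running the loop longer makes the norm of w keep growing while the normalized margin creeps upward, which is the qualitative behavior the margin-theory portion of the lecture discusses.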
On the Foundations of Deep Learning - SGD, Overparametrization, and Generalization