1. Loss Landscape and Performance in Deep Learning
2. Supervised Deep Learning
3. Set-up: Architecture
4. Set-up: Dataset
5. Learning
6. Learning dynamics = descent in loss landscape
7. Analogy with granular matter: Jamming
8. Theoretical results: Phase diagram
9. Empirical tests: MNIST parity
10. Landscape curvature
11. Flat directions
12. Outline
13. Overfitting?
14. Ensemble average
15. Fluctuations increase error
16. Scaling argument!
17. Infinitely-wide networks: Initialization
18. Infinitely-wide networks: Learning
19. Neural Tangent Kernel
20. Finite N asymptotics?
21. Conclusion
Description:
Explore the intricacies of deep learning performance and loss landscapes in this 46-minute conference talk. Delve into supervised deep learning concepts, including network architecture and dataset setup. Examine learning dynamics as a descent in the loss landscape, drawing parallels with granular matter jamming. Analyze theoretical phase diagrams and empirical tests using MNIST parity. Investigate landscape curvature, flat directions, and the potential for overfitting. Consider ensemble averages and how fluctuations impact error rates. Discover scaling arguments and explore infinitely-wide networks, including initialization and learning processes. Gain insights into the Neural Tangent Kernel and finite N asymptotics. Enhance your understanding of deep learning theory and its connections to statistical physics.
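As a rough illustration of the "learning dynamics = descent in loss landscape" idea listed in the outline, the sketch below (not taken from the talk) trains a tiny one-hidden-layer network by plain gradient descent and prints the loss along the trajectory. The toy dataset, network width, and learning rate are arbitrary placeholders chosen only to make the example self-contained.

```python
# Minimal illustration (not from the talk): gradient descent viewed as a
# descent in the loss landscape of a tiny one-hidden-layer network.
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset and network size are arbitrary placeholders.
P, d, h = 64, 5, 20                      # samples, input dim, hidden width
X = rng.standard_normal((P, d))
y = np.sign(X[:, 0])                     # simple teacher rule

W1 = rng.standard_normal((d, h)) / np.sqrt(d)
W2 = rng.standard_normal(h) / np.sqrt(h)

def forward(X, W1, W2):
    a = np.tanh(X @ W1)                  # hidden activations
    return a @ W2, a

lr = 0.1
for step in range(500):
    out, a = forward(X, W1, W2)
    err = out - y
    loss = 0.5 * np.mean(err ** 2)       # quadratic loss landscape
    # Gradients of the loss with respect to both weight matrices.
    g_out = err / P
    gW2 = a.T @ g_out
    gW1 = X.T @ ((g_out[:, None] * W2) * (1 - a ** 2))
    W1 -= lr * gW1                       # one descent step in the landscape
    W2 -= lr * gW2
    if step % 100 == 0:
        print(f"step {step:4d}  loss {loss:.4f}")
```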

Loss Landscape and Performance in Deep Learning by Stefano Spigler

International Centre for Theoretical Sciences