1. Intro
2. Low rank models and implicit regularization
3. Regimes of over-parametrization
4. Tensor (CP) decomposition
5. Why naïve algorithm fails
6. Why gradient descent?
7. Two-Layer Neural Network
8. Form of the objective
9. Difficulties of analyzing gradient descent
10. Lazy training fails
11. 0 is a high order saddle point
12. There are local minima away from 0
13. Our (high level) algorithm
14. Proof ideas
15. Escaping local minima by random correlation
16. Amplify initial correlation by tensor power method
17. Conclusions and Open Problems
Description:
Explore tensor decomposition and over-parameterization in this 37-minute conference talk from the Fields Institute's Mini-symposium on Low-Rank Models and Applications. Delve into the comparison between the lazy training regime and gradient descent for finding an approximate tensor decomposition. Examine the difficulties of analyzing gradient descent, why lazy training fails, and the existence of local minima away from zero. Learn about a novel algorithm that escapes local minima through random correlation and amplifies the initial correlation using the tensor power method. Gain insights into the importance of over-parameterization in training neural networks and its implications for avoiding bad local optima.
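The tensor power method mentioned above is a classical primitive, and the following minimal sketch (an illustration, not the speaker's algorithm) shows the amplification effect it refers to: for a symmetric rank-r tensor with orthonormal components, repeated power iterations boost a random start's small initial correlation with one component toward 1. All names here (d, r, A, T, u) are illustrative assumptions.

```python
# Hypothetical illustration of tensor power iteration on a symmetric
# third-order tensor T = sum_k a_k ⊗ a_k ⊗ a_k with orthonormal a_k.
# The map u -> T(I, u, u) / ||T(I, u, u)|| amplifies the component that
# a random starting vector happens to be most correlated with.
import numpy as np

rng = np.random.default_rng(0)
d, r = 50, 5

# Ground-truth orthonormal components (columns of A) and the rank-r tensor.
A = np.linalg.qr(rng.standard_normal((d, r)))[0]
T = np.einsum('ik,jk,lk->ijl', A, A, A)

# Random unit start: its best correlation with a component is small (~1/sqrt(d)).
u = rng.standard_normal(d)
u /= np.linalg.norm(u)

for step in range(10):
    u = np.einsum('ijl,j,l->i', T, u, u)      # apply T(I, u, u)
    u /= np.linalg.norm(u)                    # renormalize
    best = np.max(np.abs(A.T @ u))            # correlation with closest component
    print(f"iter {step}: max correlation = {best:.4f}")
```

Running the loop shows the correlation climbing rapidly toward 1, which is the amplification step the talk's high-level algorithm relies on after a random initialization supplies a weak initial correlation.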

Beyond Lazy Training for Over-parameterized Tensor Decomposition

Fields Institute