1. Intro
2. Low rank models and implicit regularization
3. Regimes of over-parametrization
4. Tensor (CP) decomposition
5. Why naïve algorithm fails
6. Why gradient descent?
7. Two-Layer Neural Network
8. Form of the objective
9. Difficulties of analyzing gradient descent
10. Lazy training fails
11. 0 is a high order saddle point
12. There are local minima away from 0
13. Our (high level) algorithm
14. Proof ideas
15. Escaping local minima by random correlation
16. Amplify initial correlation by tensor power method
17. Conclusions and Open Problems
Description:
Explore tensor decomposition and over-parameterization in this 37-minute conference talk from the Fields Institute's Mini-symposium on Low-Rank Models and Applications. Delve into the comparison between the lazy training regime and gradient descent for finding an approximate tensor decomposition. Examine the difficulties of analyzing gradient descent, why lazy training fails, and the existence of local minima away from zero. Learn about a novel algorithm that escapes local minima through random correlation and amplifies the initial correlation using the tensor power method. Gain insights into the importance of over-parameterization in training neural networks and its implications for avoiding bad local optima.
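The tensor power method mentioned above is a classical primitive, and the following minimal sketch (an illustration, not the speaker's algorithm) shows the amplification effect it refers to: for a symmetric rank-r tensor with orthonormal components, repeated power iterations boost a random start's small initial correlation with one component toward 1. All names here (d, r, A, T, u) are illustrative assumptions.

```python
# Hypothetical illustration of tensor power iteration on a symmetric
# third-order tensor T = sum_k a_k ⊗ a_k ⊗ a_k with orthonormal a_k.
# The map u -> T(I, u, u) / ||T(I, u, u)|| amplifies the component that
# a random starting vector happens to be most correlated with.
import numpy as np

rng = np.random.default_rng(0)
d, r = 50, 5

# Ground-truth orthonormal components (columns of A) and the rank-r tensor.
A = np.linalg.qr(rng.standard_normal((d, r)))[0]
T = np.einsum('ik,jk,lk->ijl', A, A, A)

# Random unit start: its best correlation with a component is small (~1/sqrt(d)).
u = rng.standard_normal(d)
u /= np.linalg.norm(u)

for step in range(10):
    u = np.einsum('ijl,j,l->i', T, u, u)      # apply T(I, u, u)
    u /= np.linalg.norm(u)                    # renormalize
    best = np.max(np.abs(A.T @ u))            # correlation with closest component
    print(f"iter {step}: max correlation = {best:.4f}")
```

Running the loop shows the correlation climbing rapidly toward 1, which is the amplification step the talk's high-level algorithm relies on after a random initialization supplies a weak initial correlation.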

Beyond Lazy Training for Over-parameterized Tensor Decomposition

Fields Institute