SGD and Weight Decay Secretly Compress Your Neural Network
Description:
Explore how stochastic gradient descent (SGD) and weight decay implicitly compress neural networks in this 55-minute conference talk by Tomer Galanti (MIT). The talk examines the mechanisms behind this hidden compression effect and what it means for the efficiency and performance of deep learning models.
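One common way to make "compression" concrete is to look at how many singular values of a weight matrix remain significant after training. The sketch below, assuming PyTorch, is only an illustration of that idea (not the speaker's method): it trains a small MLP with SGD plus a weight_decay penalty on synthetic data and then reports the effective rank of the first layer; the model size, data, learning rate, and threshold are arbitrary choices.

import torch
import torch.nn as nn

# Illustrative sketch: SGD with weight decay, then a crude effective-rank check.
# All hyperparameters and data here are placeholders, not from the talk.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=5e-4)
loss_fn = nn.CrossEntropyLoss()

# Synthetic data standing in for a real dataset.
x = torch.randn(512, 100)
y = torch.randint(0, 10, (512,))

for step in range(200):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

# Effective rank: count singular values above a small fraction of the largest one.
w = model[0].weight.detach()
s = torch.linalg.svdvals(w)  # singular values in descending order
effective_rank = int((s > 0.01 * s[0]).sum())
print(f"effective rank of first layer: {effective_rank} / {min(w.shape)}")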

MITCBMM