1. μTransfer introduced
2. Previous work: Tensor Programs IV
3. NTK (neural tangent kernel) recap
4. abc-parametrization
5. How does learning happen in NTK?
6. Connections to the Central Limit Theorem
7. Maximal Update Parametrization in practice
8. DeepNet paper connection
9. Results: width is all you need?
Description:
Dive into an in-depth exploration of the "Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer (μTransfer)" paper in this comprehensive video lecture. Learn about the approach that makes optimal hyperparameters stable with respect to width scaling. Explore previous work on Tensor Programs, revisit the neural tangent kernel (NTK) concept, and understand the abc-parametrization. Discover how learning occurs in the NTK regime and its connections to the Central Limit Theorem. Gain practical insights into the Maximal Update Parametrization and its relationship to the DeepNet paper. Analyze the results that ask whether width is all you need for stable hyperparameter transfer. Enhance your understanding of advanced machine learning concepts and cutting-edge research in neural network optimization.
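
Since the description mentions the abc-parametrization and the Maximal Update Parametrization, here is a minimal PyTorch sketch (written for this summary, not taken from the video or the paper's code) of how the three width exponents a, b, c enter a toy MLP: a forward multiplier n^(-a), an initialization standard deviation n^(-b), and a learning rate η·n^(-c). The names ABCLinear and make_mlp_and_lr, and the use of a single shared exponent per network, are illustrative simplifications; the paper assigns separate exponents per layer type, and μP corresponds to one particular choice of exponents discussed in the lecture.

```python
# Illustrative sketch of the abc-parametrization idea (not the authors' code).
import torch
import torch.nn as nn


class ABCLinear(nn.Module):
    """Linear layer computing y = n^(-a) * (x @ W^T), with W init ~ N(0, n^(-2b))."""

    def __init__(self, fan_in: int, fan_out: int, width: int, a: float, b: float):
        super().__init__()
        # entries of the trainable weight drawn from N(0, n^(-2b)), i.e. std = n^(-b)
        self.weight = nn.Parameter(torch.randn(fan_out, fan_in) * width ** (-b))
        # fixed forward multiplier n^(-a)
        self.scale = width ** (-a)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.scale * (x @ self.weight.t())


def make_mlp_and_lr(width: int, eta: float, a: float, b: float, c: float):
    """Toy 2-hidden-layer MLP (32-dim input, scalar output) plus its width-scaled LR."""
    model = nn.Sequential(
        ABCLinear(32, width, width, a, b), nn.ReLU(),
        ABCLinear(width, width, width, a, b), nn.ReLU(),
        ABCLinear(width, 1, width, a, b),
    )
    lr = eta * width ** (-c)  # learning rate eta * n^(-c)
    return model, lr


# Example with placeholder exponents: a=0, b=0.5, c=0 is roughly the familiar
# 1/sqrt(n) initialization with an unscaled learning rate; different (a, b, c)
# choices recover standard, NTK, or maximal-update parametrizations.
model, lr = make_mlp_and_lr(width=1024, eta=0.1, a=0.0, b=0.5, c=0.0)
optimizer = torch.optim.SGD(model.parameters(), lr=lr)
```

The point of the exponent bookkeeping is that once the width dependence is made explicit like this, one can ask which choice keeps the optimal base learning rate η stable as the width grows, which is the property μTransfer exploits for zero-shot hyperparameter transfer.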

Tensor Programs - Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer

Aleksa Gordić - The AI Epiphany