Chapters:
1. intro
2. starter code
3. fixing the initial loss
4. fixing the saturated tanh
5. calculating the init scale: “Kaiming init”
6. batch normalization
7. batch normalization: summary
8. real example: resnet50 walkthrough
9. summary of the lecture
10. just kidding: part2: PyTorch-ifying the code
11. viz #1: forward pass activations statistics
12. viz #2: backward pass gradient statistics
13. the fully linear case of no non-linearities
14. viz #3: parameter activation and gradient statistics
15. viz #4: update:data ratio over time
16. bringing back batchnorm, looking at the visualizations
17. summary of the lecture for real this time
Description:
Dive deep into the internals of multi-layer perceptrons (MLPs) in this comprehensive video lecture. Explore the statistics of forward pass activations and backward pass gradients, while learning about potential pitfalls in improperly scaled networks. Discover essential diagnostic tools and visualizations for assessing the health of deep networks. Understand the challenges of training deep neural networks and learn about Batch Normalization, a key innovation that simplifies the process. Gain practical insights through code examples, real-world applications, and visualizations. Complete provided exercises to reinforce your understanding of weight initialization and BatchNorm implementation. Follow along as the lecture covers topics such as Kaiming initialization, PyTorch implementation, and various visualization techniques for network analysis.
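The description names the two techniques the lecture builds toward: scaling the initial weights (Kaiming init) and batch normalization. As a rough illustration of the idea only, here is a minimal PyTorch sketch, not the lecture's exact code, with assumed layer sizes (fan_in, fan_out) and batch size:

```python
import torch

# Illustrative sketch only (assumed shapes, not the lecture's exact code).
# Kaiming-style init: scale weights by gain / sqrt(fan_in) so pre-activations
# keep roughly unit variance; for tanh the recommended gain is 5/3.
fan_in, fan_out = 30, 200
W = torch.randn(fan_in, fan_out) * (5/3) / fan_in**0.5

# Minimal batch normalization over a batch of pre-activations: normalize each
# feature to zero mean / unit std, then apply a learnable scale and shift.
x = torch.randn(32, fan_in) @ W          # hypothetical batch of pre-activations
bngain = torch.ones(1, fan_out)          # learnable scale (gamma)
bnbias = torch.zeros(1, fan_out)         # learnable shift (beta)
xhat = bngain * (x - x.mean(0, keepdim=True)) / (x.std(0, keepdim=True) + 1e-5) + bnbias
print(xhat.mean().item(), xhat.std().item())  # roughly 0 and 1
```

With the normalization in place, the printed statistics land near 0 and 1 regardless of how W was scaled, which is the property the lecture's activation and gradient visualizations inspect.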

Building Makemore - Activations & Gradients, BatchNorm

Andrej Karpathy