Chapters:
2. Bengio et al. 2003 MLP language model paper walkthrough
3. re-building our training dataset
4. implementing the embedding lookup table
5. implementing the hidden layer + internals of torch.Tensor: storage, views
6. implementing the output layer
7. implementing the negative log likelihood loss
8. summary of the full network
9. introducing F.cross_entropy and why
10. implementing the training loop, overfitting one batch
11. training on the full dataset, minibatches
12. finding a good initial learning rate
13. splitting up the dataset into train/val/test splits and why
14. experiment: larger hidden layer
15. visualizing the character embeddings
16. experiment: larger embedding size
17. summary of our final code, conclusion
18. sampling from the model
19. Google Colab (new!!) notebook advertisement
Description:
Dive into the implementation of a multilayer perceptron (MLP) character-level language model in this comprehensive video tutorial. Learn essential machine learning concepts including model training, learning rate tuning, hyperparameters, evaluation, train/dev/test splits, and under/overfitting. Follow along as the instructor builds a training dataset, implements embedding lookup tables and hidden layers, and explores the internals of PyTorch tensors. Discover how to implement output layers, negative log likelihood loss, and F.cross_entropy. Practice overfitting on a single batch before training on the full dataset with minibatches. Explore techniques for finding optimal learning rates and splitting datasets. Experiment with larger hidden layers and embedding sizes, visualize character embeddings, and learn to sample from the trained model. Access provided resources, including GitHub repositories, Jupyter notebooks, and relevant research papers to enhance your understanding and complete suggested exercises.
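To make the steps above concrete, the sketches that follow approximate the video's code in plain PyTorch. They are minimal illustrations, not the instructor's exact notebook: the word list, seeds, and layer sizes below are stand-in assumptions. First, building the training dataset by sliding a fixed-size context window over each word:

```python
import torch

# Stand-in corpus; the video uses the full names.txt dataset.
words = ["emma", "olivia", "ava"]

# Character vocabulary, with '.' as a boundary token at index 0.
chars = sorted(set("".join(words)))
stoi = {ch: i + 1 for i, ch in enumerate(chars)}
stoi["."] = 0
itos = {i: ch for ch, i in stoi.items()}

block_size = 3  # context length: how many characters we use to predict the next one

# Slide a window over each word to build (context, next-character) pairs.
xs, ys = [], []
for w in words:
    context = [0] * block_size
    for ch in w + ".":
        ix = stoi[ch]
        xs.append(context)
        ys.append(ix)
        context = context[1:] + [ix]  # crop the oldest character, append the new one

X = torch.tensor(xs)  # (N, block_size) integer contexts
Y = torch.tensor(ys)  # (N,) integer targets
```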
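Next, the network itself. The embedding lookup is plain integer indexing into a table C; the hidden layer's .view call is cheap because a view reinterprets the tensor's existing storage rather than copying it; and F.cross_entropy replaces a hand-rolled softmax plus negative log likelihood with one fused, numerically stable call. The seed and sizes here are assumptions:

```python
import torch
import torch.nn.functional as F

# Continues from the dataset sketch above (X, Y, block_size, stoi).
g = torch.Generator().manual_seed(2147483647)  # seed is an arbitrary choice
vocab_size = len(stoi)
emb_dim = 2      # embedding size; one of the video's experiments enlarges this
n_hidden = 100   # hidden layer width; also enlarged experimentally

C  = torch.randn((vocab_size, emb_dim), generator=g)             # embedding lookup table
W1 = torch.randn((block_size * emb_dim, n_hidden), generator=g)  # hidden layer
b1 = torch.randn(n_hidden, generator=g)
W2 = torch.randn((n_hidden, vocab_size), generator=g)            # output layer
b2 = torch.randn(vocab_size, generator=g)
parameters = [C, W1, b1, W2, b2]

emb = C[X]  # (N, block_size, emb_dim): indexing performs the lookup
h = torch.tanh(emb.view(-1, block_size * emb_dim) @ W1 + b1)  # (N, n_hidden)
logits = h @ W2 + b2                                          # (N, vocab_size)
loss = F.cross_entropy(logits, Y)  # fused log-softmax + negative log likelihood
```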
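The training loop first overfits a single small batch as a sanity check (the loss should drop close to zero), then trains on the full dataset with random minibatches. A sketch of the minibatch loop, with a step-decay schedule as one plausible choice:

```python
# Continues from the sketches above.
for p in parameters:
    p.requires_grad = True

for step in range(20000):
    ix = torch.randint(0, X.shape[0], (32,))  # random minibatch of 32 examples

    # forward pass on the minibatch only
    emb = C[X[ix]]
    h = torch.tanh(emb.view(-1, block_size * emb_dim) @ W1 + b1)
    logits = h @ W2 + b2
    loss = F.cross_entropy(logits, Y[ix])

    # backward pass
    for p in parameters:
        p.grad = None
    loss.backward()

    # parameter update; the step-decay schedule here is an assumption
    lr = 0.1 if step < 10000 else 0.01
    for p in parameters:
        p.data += -lr * p.grad
```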
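One way to find a good initial learning rate, in the spirit of the video: run a short training pass where the rate sweeps over exponentially spaced candidates, record the loss at each step, and pick a rate from the region where the loss falls fastest before it destabilizes:

```python
# Continues from the sketches above (re-initialize parameters before running this).
lre = torch.linspace(-3, 0, 1000)  # exponents: candidates from 1e-3 up to 1e0
lrs = 10 ** lre
lri, lossi = [], []

for step in range(1000):
    ix = torch.randint(0, X.shape[0], (32,))
    emb = C[X[ix]]
    h = torch.tanh(emb.view(-1, block_size * emb_dim) @ W1 + b1)
    logits = h @ W2 + b2
    loss = F.cross_entropy(logits, Y[ix])

    for p in parameters:
        p.grad = None
    loss.backward()

    lr = lrs[step]
    for p in parameters:
        p.data += -lr * p.grad

    lri.append(lre[step].item())
    lossi.append(loss.item())

# Plotting lossi against lri shows where the loss drops fastest; a rate just
# below the point where the curve blows up is a reasonable pick.
```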
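Splitting the dataset guards against tuning on what you evaluate: the training split fits parameters, the dev/validation split tunes hyperparameters, and the test split is evaluated sparingly at the end. A sketch with an 80/10/10 split at the word level (build_dataset is a hypothetical helper wrapping the windowing loop from the first sketch):

```python
import random
import torch

def build_dataset(words, block_size, stoi):
    # Hypothetical helper: same windowing loop as in the dataset sketch.
    xs, ys = [], []
    for w in words:
        context = [0] * block_size
        for ch in w + ".":
            ix = stoi[ch]
            xs.append(context)
            ys.append(ix)
            context = context[1:] + [ix]
    return torch.tensor(xs), torch.tensor(ys)

random.seed(42)        # seed is an assumption
random.shuffle(words)  # shuffle at the word level so windows don't leak across splits
n1 = int(0.8 * len(words))
n2 = int(0.9 * len(words))
Xtr,  Ytr  = build_dataset(words[:n1],   block_size, stoi)  # train: fit parameters
Xdev, Ydev = build_dataset(words[n1:n2], block_size, stoi)  # dev: tune hyperparameters
Xte,  Yte  = build_dataset(words[n2:],   block_size, stoi)  # test: evaluate once at the end
```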
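When the embedding size is 2, the lookup table C can be scattered directly to see which characters the model places near each other (with larger embeddings you would first project down, e.g. with PCA). A sketch assuming matplotlib:

```python
import matplotlib.pyplot as plt

# Continues from the sketches above; assumes emb_dim == 2.
plt.figure(figsize=(8, 8))
plt.scatter(C[:, 0].data, C[:, 1].data, s=200)
for i in range(C.shape[0]):
    plt.text(C[i, 0].item(), C[i, 1].item(), itos[i],
             ha="center", va="center", color="white")
plt.grid(True)
plt.show()
```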
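Finally, sampling: start from a context of boundary tokens, repeatedly draw the next character from the model's output distribution, and stop when the '.' token comes up again:

```python
# Continues from the trained parameters above.
g = torch.Generator().manual_seed(2147483647 + 10)  # seed is an assumption

with torch.no_grad():
    for _ in range(5):  # draw five samples
        out = []
        context = [0] * block_size  # start with all '.' boundary tokens
        while True:
            emb = C[torch.tensor([context])]           # (1, block_size, emb_dim)
            h = torch.tanh(emb.view(1, -1) @ W1 + b1)
            logits = h @ W2 + b2
            probs = F.softmax(logits, dim=1)
            ix = torch.multinomial(probs, num_samples=1, generator=g).item()
            context = context[1:] + [ix]
            if ix == 0:  # sampled the '.' end token
                break
            out.append(itos[ix])
        print("".join(out))
```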