– How to summarise papers as @y0b1byte with Notion
– Why do we need to go to a higher hidden dimension?
– Today's class: recurrent neural nets
– Vector to sequence (vec2seq)
– Sequence to vector (seq2vec)
– Sequence to vector to sequence (seq2vec2seq)
– Sequence to sequence (seq2seq)
– Training a recurrent network: backpropagation through time
– Training example: language model
– Vanishing & exploding gradients and gating mechanism
– The Long Short-Term Memory (LSTM)
– Jupyter Notebook and PyTorch in action: sequence classification
– Inspecting the activation values
– Closing remarks
Description:
Explore recurrent neural networks, including vanilla and gated (LSTM) architectures, in this comprehensive lecture. Dive into sequence processing setups such as vector-to-sequence, sequence-to-vector, sequence-to-vector-to-sequence, and sequence-to-sequence models. Learn about backpropagation through time, language modeling, and the challenges of vanishing and exploding gradients. Discover the Long Short-Term Memory (LSTM) architecture and its gating mechanism. Gain hands-on experience with a Jupyter Notebook and PyTorch demonstration of sequence classification, followed by an inspection of the activation values. Understand how to summarize research papers effectively and why recurrent networks benefit from a higher hidden dimension.
Recurrent Neural Networks, Vanilla and Gated - LSTM
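
The hands-on segment mentioned in the description uses a Jupyter Notebook with PyTorch for sequence classification. The following is a minimal sketch of that kind of demo, not the lecture's actual notebook: the SequenceClassifier module, the toy make_batch task, and all hyperparameters are assumptions chosen for illustration. It follows the seq2vec pattern from the chapter list, reading a whole sequence with an LSTM and emitting a single label from the final hidden state.

import torch
import torch.nn as nn

torch.manual_seed(0)

class SequenceClassifier(nn.Module):
    """LSTM-based seq2vec classifier: read a whole sequence, emit one label."""
    def __init__(self, input_size=8, hidden_size=32, num_classes=2):
        super().__init__()
        # The LSTM's input, forget, and output gates control what enters,
        # stays in, and leaves the cell state, which helps with the
        # vanishing/exploding gradient issues discussed in the lecture.
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, x):                 # x: (batch, time, input_size)
        _, (h_n, _) = self.lstm(x)        # h_n: (1, batch, hidden_size), final hidden state
        return self.head(h_n[-1])         # logits: (batch, num_classes)

def make_batch(batch_size=64, seq_len=20, input_size=8):
    """Toy task (an assumption, not the lecture's dataset): label is 1 iff the
    first feature channel is, on average, positive over the sequence."""
    x = torch.randn(batch_size, seq_len, input_size)
    y = (x[:, :, 0].mean(dim=1) > 0).long()
    return x, y

model = SequenceClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for step in range(200):
    x, y = make_batch()
    logits = model(x)
    loss = criterion(logits, y)
    optimizer.zero_grad()
    loss.backward()                       # backpropagation through time happens here
    optimizer.step()
    if step % 50 == 0:
        acc = (logits.argmax(dim=1) == y).float().mean().item()
        print(f"step {step:3d}  loss {loss.item():.3f}  acc {acc:.2f}")

Swapping nn.LSTM for nn.RNN in the sketch gives the vanilla recurrent variant covered earlier in the lecture; keeping only the final hidden state is what makes this a seq2vec setup rather than seq2seq.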