Chapters:
1. Awesome song and introduction
2. Word Embedding
3. Positional Encoding
4. Self-Attention
5. Encoder and Decoder defined
6. Decoder Word Embedding
7. Decoder Positional Encoding
8. Transformers were designed for parallel computing
9. Decoder Self-Attention
10. Encoder-Decoder Attention
11. Decoding numbers into words
12. Decoding the second token
13. Extra stuff you can add to a Transformer
Description:
Dive into a comprehensive 36-minute video explanation of Transformer Neural Networks, the foundation of cutting-edge AI technologies like ChatGPT and Google Translate. Learn about word embedding, positional encoding, self-attention mechanisms, and the encoder-decoder architecture. Explore how Transformers are designed for parallel computing and understand the decoding process. Gain insights into additional components that can enhance Transformer performance. Supplementary links are provided for deeper understanding of related concepts such as backpropagation, SoftMax function, and cosine similarity.
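The core mechanisms named above can be summarized in a few lines of code. Below is a minimal NumPy sketch (not taken from the video) of sinusoidal positional encoding added to toy word embeddings, followed by single-head scaled dot-product self-attention with a SoftMax over the attention scores. The function names, dimensions, and the identity Q/K/V projections are illustrative assumptions, not the video's implementation.

# Illustrative NumPy sketch: positional encoding + self-attention on toy embeddings.
import numpy as np

def positional_encoding(seq_len, d_model):
    # PE[pos, 2i] = sin(pos / 10000^(2i/d_model)), PE[pos, 2i+1] = cos(...)
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

def self_attention(x):
    # Single-head scaled dot-product attention; Q, K, V use identity
    # projections here instead of learned weight matrices (toy example).
    q, k, v = x, x, x
    d = x.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # SoftMax over each row
    return weights @ v

# Toy "word embeddings" for a 3-token sentence with d_model = 4.
embeddings = np.random.randn(3, 4)
x = embeddings + positional_encoding(3, 4)   # add position information
print(self_attention(x).shape)               # (3, 4): one output per token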

Transformer Neural Networks, ChatGPT's Foundation, Clearly Explained

StatQuest with Josh Starmer