1. Introduction
2. Transformer at a high level
3. Why Batch Data? Why Fixed Length Sequence?
4. Embeddings
5. Positional Encodings
6. Query, Key and Value vectors
7. Masked Multi Head Self Attention
8. Residual Connections
9. Layer Normalization
10. Decoder
11. Masked Multi Head Cross Attention
12.
13. Tokenization & Generating the next translated word
14. Transformer Inference Example
Description:
Dive deep into the Transformer Neural Network Architecture for language translation in this comprehensive 28-minute video. Explore key concepts including batch data processing, fixed-length sequences, embeddings, positional encodings, query/key/value vectors, masked multi-head self-attention, residual connections, layer normalization, decoder architecture, cross-attention mechanisms, tokenization, and word generation. Gain practical insights through a Transformer inference example and access additional resources for further learning on neural networks, machine learning, and related mathematical concepts.
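For quick reference, below is a minimal NumPy sketch of two of the ideas the video covers: sinusoidal positional encodings and causally masked single-head self-attention. The function names, matrix shapes, and random weights are illustrative assumptions for this sketch, not code from the video.

import numpy as np


def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encodings, shape (seq_len, d_model)."""
    positions = np.arange(seq_len)[:, None]                  # (seq_len, 1)
    dims = np.arange(d_model)[None, :]                       # (1, d_model)
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])                    # even dimensions
    pe[:, 1::2] = np.cos(angles[:, 1::2])                    # odd dimensions
    return pe


def masked_self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention with a causal mask."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v                      # (seq_len, d_k)
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                          # (seq_len, seq_len)
    # Causal mask: position i may only attend to positions <= i.
    mask = np.triu(np.ones_like(scores), k=1).astype(bool)
    scores = np.where(mask, -1e9, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)           # softmax over keys
    return weights @ v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq_len, d_model = 4, 8
    x = rng.normal(size=(seq_len, d_model)) + positional_encoding(seq_len, d_model)
    w = lambda: rng.normal(size=(d_model, d_model))
    out = masked_self_attention(x, w(), w(), w())
    print(out.shape)  # (4, 8)

In a full Transformer these pieces would be wrapped with multiple heads, residual connections, and layer normalization, as the video's later chapters describe; the sketch only shows the core masking and encoding mechanics.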

The Complete Guide to Transformer Neural Networks

CodeEmporium