1. Intro
2. Sentence Representations
3. Basic Idea (Bahdanau et al. 2015)
4. Calculating Attention (1)
5. A Graphical Example
6. Attention Score Functions (2) • see the score-function sketch after this list
7. Input Sentence
8. Previously Generated Things
9. Various Modalities
10. Hierarchical Structures (Yang et al. 2016)
11. Multiple Sources
12. Intra-Attention / Self Attention (Cheng et al. 2016) • Each element in the sentence attends to other elements + context-sensitive encodings!
13. Coverage
14. Incorporating Markov Properties (Cohn et al. 2015)
15. Bidirectional Training (Cohn et al. 2015)
16. Supervised Training (Mi et al. 2016)
17. Attention is not Alignment! (Koehn and Knowles 2017) • Attention is often blurred
18. Monotonic Attention (e.g. Yu et al. 2016)
19. Convolutional Attention (Allamanis et al. 2016)
20. Multi-headed Attention
21. Summary of the "Transformer" (Vaswani et al. 2017)
22. Attention Tricks
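Chapters 4 and 6 cover how attention weights are computed from a decoder query and a set of encoder states. As a rough illustration of the three commonly cited score functions (not the lecture's own code; all variable names here are made up for the sketch), in NumPy:

```python
import numpy as np

def dot_product_score(query, keys):
    # score(q, k) = q . k -- assumes query and key vectors share a dimension
    return keys @ query

def bilinear_score(query, keys, W):
    # score(q, k) = k^T W q -- W is a learned (generalized dot-product) matrix
    return keys @ (W @ query)

def mlp_score(query, keys, W1, w2):
    # score(q, k) = w2^T tanh(W1 [k; q]) -- Bahdanau-style additive attention
    tiled = np.tile(query, (keys.shape[0], 1))
    return np.tanh(np.concatenate([keys, tiled], axis=1) @ W1.T) @ w2

def attention_weights(scores):
    # softmax turns scores into a probability distribution over the inputs
    e = np.exp(scores - scores.max())
    return e / e.sum()

# Toy run: 5 encoder states of dimension 8, one decoder query
keys, query = np.random.randn(5, 8), np.random.randn(8)
weights = attention_weights(dot_product_score(query, keys))
context = weights @ keys  # the context vector is a weighted sum of the states
```

The dot product is the cheapest but requires matching dimensions; the bilinear and MLP variants add learned parameters, with the MLP form corresponding to the additive attention of Bahdanau et al. (2015).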
Description:
Explore a comprehensive lecture on attention mechanisms in neural networks for natural language processing. Delve into the fundamentals of attention, including what to attend to, improvements to attention, and specialized attention varieties. Examine a case study on the "Attention is All You Need" paper. Learn about various attention score functions, hierarchical structures, multiple sources, and intra-attention. Discover advanced concepts such as coverage, bidirectional training, supervised training, and monotonic attention. Investigate convolutional attention and multi-headed attention, concluding with a summary of the "Transformer" model and useful attention tricks. Access accompanying slides and code examples to enhance your understanding of these crucial NLP concepts.
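Since the lecture culminates in multi-headed attention and the Transformer (chapters 20-21), here is a hedged sketch of the idea, assuming NumPy and made-up parameter names (Wq, Wk, Wv, Wo): multi-headed scaled dot-product self-attention splits the model dimension into several heads, attends independently in each, and concatenates the results.

```python
# A sketch of multi-headed scaled dot-product attention in the spirit of
# the "Transformer" (Vaswani et al. 2017). Parameter names and shapes are
# assumptions for illustration, not the paper's reference code.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads):
    # X: (seq_len, d_model); each W*: (d_model, d_model)
    seq_len, d_model = X.shape
    d_head = d_model // n_heads
    def split(M):  # project, then split into (n_heads, seq_len, d_head)
        return (X @ M).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    Q, K, V = split(Wq), split(Wk), split(Wv)
    # Scaled dot-product attention, computed independently per head
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)  # (n_heads, seq, seq)
    heads = softmax(scores) @ V                          # (n_heads, seq, d_head)
    # Concatenate heads and apply the output projection
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ Wo

# Toy run: self-attention over a 6-token "sentence" with 4 heads
d_model, n_heads = 16, 4
X = np.random.randn(6, d_model)
Ws = [np.random.randn(d_model, d_model) / np.sqrt(d_model) for _ in range(4)]
out = multi_head_attention(X, *Ws, n_heads=n_heads)  # shape (6, 16)
```

Running each head in a lower-dimensional subspace lets the model attend to different positions and relations in parallel, at roughly the cost of a single full-width attention.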

Neural Nets for NLP 2017 - Attention

Graham Neubig