Input Sentence: Bias. If you have a translation dictionary, use it to bias outputs (Arthur et al. 2016); a sketch of this idea appears after this list.
Previously Generated Things
Various Modalities
Multiple Sources
Coverage
Incorporating Markov Properties (Cohn et al. 2015)
Supervised Training (Mi et al. 2016)
Hard Attention
Multi-headed Attention (see the second sketch after this list)
Attention Tricks
Training Tricks
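The lexicon-bias item above comes from Arthur et al. (2016), who nudge the decoder's output distribution toward translations found in a dictionary. Below is a minimal numpy sketch of the log-bias variant of that idea; the array sizes, the random stand-in "dictionary", and the epsilon value are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax.
    e = np.exp(x - x.max())
    return e / e.sum()

def lexicon_biased_logits(logits, attention, lex_probs, epsilon=1e-3):
    """Bias decoder logits with a translation-dictionary lexicon.

    logits:    (vocab,) pre-softmax scores for the next target word.
    attention: (src_len,) attention weights over source words.
    lex_probs: (src_len, vocab) dictionary-derived probabilities
               p(target word | source word).
    Sketches the log-bias combination from Arthur et al. (2016):
    the attention-weighted lexicon distribution is added in log space.
    """
    p_lex = attention @ lex_probs          # (vocab,) expected lexicon probability
    return logits + np.log(p_lex + epsilon)

rng = np.random.default_rng(0)
src_len, vocab = 4, 10
logits = rng.standard_normal(vocab)
attention = softmax(rng.standard_normal(src_len))
lex = rng.random((src_len, vocab))
lex /= lex.sum(axis=1, keepdims=True)      # each row is a proper distribution

print(softmax(logits).round(3))
print(softmax(lexicon_biased_logits(logits, attention, lex)).round(3))
```

The epsilon term keeps target words absent from the dictionary from being zeroed out entirely, so the bias steers rather than overrides the model.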
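For the multi-headed attention item, and the "Attention is All You Need" case study mentioned in the description below, here is a small numpy sketch of scaled dot-product attention split across several heads. The randomly initialized projection matrices, head count, and shapes are illustrative assumptions; a real model would learn the projections.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(queries, keys, values, num_heads, rng):
    """Minimal multi-head scaled dot-product attention.

    queries: (n_q, d_model); keys/values: (n_kv, d_model).
    """
    d_model = queries.shape[-1]
    assert d_model % num_heads == 0
    d_head = d_model // num_heads

    # One (d_model, d_model) projection per role, split into heads below.
    W_q, W_k, W_v, W_o = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
                          for _ in range(4))

    def split_heads(x):
        # (n, d_model) -> (num_heads, n, d_head)
        return x.reshape(x.shape[0], num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split_heads(queries @ W_q), split_heads(keys @ W_k), split_heads(values @ W_v)

    # Scaled dot-product attention, computed independently per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)    # (h, n_q, n_kv)
    weights = softmax(scores, axis=-1)
    context = weights @ v                                  # (h, n_q, d_head)

    # Concatenate heads and apply the output projection.
    concat = context.transpose(1, 0, 2).reshape(queries.shape[0], d_model)
    return concat @ W_o, weights

rng = np.random.default_rng(0)
out, attn = multi_head_attention(rng.standard_normal((5, 64)),
                                 rng.standard_normal((7, 64)),
                                 rng.standard_normal((7, 64)),
                                 num_heads=8, rng=rng)
print(out.shape, attn.shape)  # (5, 64) (8, 5, 7)
```

Splitting the model dimension across heads lets each head attend to different positions and subspaces of the input at roughly the same total cost as a single full-width attention.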
Description:
Explore attention mechanisms in neural networks for natural language processing in this lecture from CMU's Neural Networks for NLP course. Delve into what to attend to, improvements to attention techniques, and specialized varieties of attention, and examine a case study on the "Attention is All You Need" paper. Learn about attention score functions, input-sentence handling, multi-headed attention, and training tricks, and gain insights into incorporating Markov properties, supervised training for attention, and hard attention. The presentation covers the essential topics for understanding and implementing attention in NLP models.
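Since the description mentions attention score functions, here is a compact sketch of three standard choices, dot product, bilinear, and a small MLP, in the style of Luong et al. (2015) and Bahdanau et al. (2015). The dimensions and randomly initialized parameters are illustrative assumptions, not the lecture's reference setup.

```python
import numpy as np

def dot_score(q, k):
    # Dot product: simplest choice; query and key must share dimensionality.
    return q @ k

def bilinear_score(q, k, W):
    # Bilinear ("general"): a learned matrix mediates the query-key match.
    return q @ W @ k

def mlp_score(q, k, W1, w2):
    # MLP (additive) score: W1 projects the concatenated [q; k],
    # w2 reduces the hidden vector to a scalar.
    return w2 @ np.tanh(W1 @ np.concatenate([q, k]))

rng = np.random.default_rng(0)
d = 16
q = rng.standard_normal(d)
W = rng.standard_normal((d, d))
W1, w2 = rng.standard_normal((d, 2 * d)), rng.standard_normal(d)

# Attention weights over a source sentence: softmax of the query's
# score against each key.
keys = rng.standard_normal((7, d))
scores = np.array([bilinear_score(q, k, W) for k in keys])
weights = np.exp(scores - scores.max())
weights /= weights.sum()
print(weights.round(3), weights.sum())
```

Swapping `bilinear_score` for `dot_score` or `mlp_score` in the loop changes only how each score is computed; the softmax normalization over the sentence stays the same.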