Intro

Sentence Representations

Calculating Attention (1)

A Graphical Example

Attention Score Functions (1)

Attention Score Functions (2)

Multi-headed Attention

Attention Tricks

Summary of the Transformer

Training Tricks

Masking for Training

Incorporating Markov Properties

Coverage

Input Sentence: Copy

Dictionary Probabilities

Previously Generated Things

Various Modalities

Multiple Sources

Description:

Learn about attention mechanisms in neural networks for natural language processing in this lecture from CMU's Neural Networks for NLP course. The lecture covers the "Attention Is All You Need" paper, improvements to attention techniques, specialized varieties of attention, and what neural networks actually attend to. Topics include sentence representations, attention score functions, multi-headed attention, training tricks, and applications to various modalities, as well as incorporating Markov properties, coverage, dictionary probabilities, and handling multiple sources in attention-based models.

Neural Nets for NLP 2021 - Attention

Graham Neubig

#Computer Science #Artificial Intelligence #Neural Networks #Natural Language Processing (NLP) #Deep Learning #Attention Mechanisms #Transformer Architecture
