1. Intro
2. Language Models • Language models are generative models of text
3. Conditioned Language Models
4. Calculating the Probability of a Sentence
5. Conditional Language Models
6. One Type of Language Model (Mikolov et al. 2011)
7. How to Pass Hidden State?
8. The Generation Problem
9. Ancestral Sampling
10. Greedy Search
11. Beam Search
12. Sentence Representations
13. Calculating Attention (1)
14. A Graphical Example
15. Attention Score Functions (1)
16. Attention is not Alignment! (Koehn and Knowles 2017)
17. Coverage
18. Multi-headed Attention
19. Supervised Training (Liu et al. 2016)
20. Self Attention (Cheng et al. 2016) • Each element in the sentence attends to other elements
21. Why Self Attention?
22. Transformer Attention Tricks
23. Transformer Training Tricks
24. Masking for Training • We want to perform training in as few operations as possible using big matrix multiplies
25. A Unified View of Sequence-to-sequence Models
26. Code Walk
Description:
Explore machine translation and sequence-to-sequence models in this 44-minute lecture from CMU's Multilingual Natural Language Processing course. Delve into conditional language modeling, simple sequence-to-sequence models, generation methods, attention mechanisms, and self-attention/transformers. Learn about calculating sentence probabilities, hidden state passing, and various generation techniques including ancestral sampling, greedy search, and beam search. Examine sentence representations, attention score functions, and the distinction between attention and alignment. Discover multi-headed attention, supervised training approaches, and the intricacies of self-attention in Transformer models. Gain insights into Transformer training tricks, masking techniques for efficient training, and a unified view of sequence-to-sequence models. Conclude with a code walkthrough to solidify understanding of these advanced NLP concepts.
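
As a quick orientation for the outline entries on calculating sentence probabilities and conditional language models: both boil down to a chain-rule factorization. The notation below is a standard formulation (with F the source sentence and E the target), not a verbatim excerpt from the slides.

    P(E) = \prod_{t=1}^{|E|} P(e_t \mid e_1, \ldots, e_{t-1})              (language model)
    P(E \mid F) = \prod_{t=1}^{|E|} P(e_t \mid F, e_1, \ldots, e_{t-1})    (conditional language model, e.g. translation of F into E)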
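
The three generation techniques named in the description differ only in how the next token is chosen from these per-step distributions. The sketch below is an illustrative toy implementation, not the course code; next_log_probs is a hypothetical callback that maps a token prefix to a dict of log-probabilities over the vocabulary.

    import math
    import random

    def ancestral_sample(next_log_probs, bos, eos, max_len=50):
        """Sample each token from the model's distribution (ancestral sampling)."""
        seq = [bos]
        for _ in range(max_len):
            log_p = next_log_probs(seq)                  # hypothetical: {token: log-prob}
            toks, weights = zip(*[(t, math.exp(lp)) for t, lp in log_p.items()])
            tok = random.choices(toks, weights=weights)[0]
            seq.append(tok)
            if tok == eos:
                break
        return seq

    def greedy_decode(next_log_probs, bos, eos, max_len=50):
        """Pick the single most likely token at every step (greedy search)."""
        seq = [bos]
        for _ in range(max_len):
            log_p = next_log_probs(seq)
            tok = max(log_p, key=log_p.get)
            seq.append(tok)
            if tok == eos:
                break
        return seq

    def beam_decode(next_log_probs, bos, eos, beam_size=4, max_len=50):
        """Keep the beam_size highest-scoring partial hypotheses instead of just one."""
        beams = [([bos], 0.0)]                           # (sequence, cumulative log-prob)
        finished = []
        for _ in range(max_len):
            candidates = []
            for seq, score in beams:
                for tok, lp in next_log_probs(seq).items():
                    candidates.append((seq + [tok], score + lp))
            candidates.sort(key=lambda c: c[1], reverse=True)
            beams = []
            for seq, score in candidates[:beam_size]:
                (finished if seq[-1] == eos else beams).append((seq, score))
            if not beams:
                break
        finished.extend(beams)                           # hypotheses that hit max_len
        return max(finished, key=lambda c: c[1])[0]

With beam_size=1, beam_decode reduces to greedy search; larger beams trade extra computation for a better approximation of the highest-probability output.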
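
The "Attention Score Functions (1)" chapter compares ways of scoring a decoder query vector q against an encoder key vector k. The common variants, written here in standard notation as a sketch rather than a quote of the slides, are:

    multi-layer perceptron (Bahdanau et al. 2015):  a(q, k) = w_2^\top \tanh(W_1 [q; k])
    bilinear (Luong et al. 2015):                   a(q, k) = q^\top W k
    dot product (Luong et al. 2015):                a(q, k) = q^\top k
    scaled dot product (Vaswani et al. 2017):       a(q, k) = q^\top k / \sqrt{|k|}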

CMU Multilingual NLP - Machine Translation / Sequence-to-Sequence Models

Graham Neubig