Enlightenment era neural language models (NLMs) 1. Solve curse of dimensionality by sharing of statistical strength via
4. Recurrent models with (self-)attention
5. Self-attention in masked sequence model
6. SQuAD Question
7. What do BERT attention heads do?
8. There's a coreference head (!)
9. Distance metrics unify trees and vectors
10. Finding trees in vector spaces
Description:
Explore emergent linguistic structure in deep contextual neural word representations with Stanford University's Chris Manning in this 43-minute lecture from the Workshop on Theory of Deep Learning. The talk covers language modeling and how enlightenment-era neural language models (NLMs) address the curse of dimensionality by sharing statistical strength. It then examines recurrent models with (self-)attention, self-attention in masked sequence models, and a SQuAD question. The later sections look at what BERT attention heads do, including a coreference head, show how distance metrics unify trees and vectors, and close with finding trees in vector spaces.
Emergent Linguistic Structure in Deep Contextual Neural Word Representations - Chris Manning