Sequence Labeling: Given an input text X, predict an output label sequence of equal length
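For a concrete picture (my illustration, not an example from the slides), part-of-speech tagging is a canonical instance: each input token receives exactly one output label.

    # Hypothetical toy example of the sequence labeling setup:
    X = ["the", "dog", "barks"]      # input tokens
    Y = ["DET", "NOUN", "VERB"]      # one label per token
    assert len(X) == len(Y)          # output has the same length as the input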
Reminder: Bi-RNNs - A simple and standard model for sequence labeling and classification
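A minimal sketch of such a tagger in PyTorch (my illustration; the lecture does not prescribe this exact code, and the layer sizes and names are placeholders):

    import torch.nn as nn

    class BiRNNTagger(nn.Module):
        def __init__(self, vocab_size, num_tags, emb_dim=100, hidden=200):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)
            # Read the sentence both left-to-right and right-to-left
            self.rnn = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
            # Concatenated forward/backward states feed a per-token classifier
            self.out = nn.Linear(2 * hidden, num_tags)

        def forward(self, token_ids):            # token_ids: (batch, seq_len)
            states, _ = self.rnn(self.emb(token_ids))
            return self.out(states)              # (batch, seq_len, num_tags)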
Issues w/ Simple BiRNN
Alternative: Bag of n-grams
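Extracting n-gram features is straightforward; a toy sketch (assumed here, not shown in the lecture):

    def ngrams(tokens, n):
        # All contiguous n-token spans, e.g. n=2 gives word bigrams
        return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    # ngrams(["the", "dog", "barks"], 2) -> [("the", "dog"), ("dog", "barks")]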
Unknown Words
Sub-word Segmentation
Unsupervised Subword Segmentation Algorithms
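Byte-pair encoding (BPE) is one widely used algorithm in this family: repeatedly merge the most frequent adjacent symbol pair. A toy sketch that ignores word frequencies and end-of-word markers:

    from collections import Counter

    def learn_bpe_merges(words, num_merges):
        seqs = [list(w) for w in words]          # start from characters
        merges = []
        for _ in range(num_merges):
            pairs = Counter((a, b) for s in seqs for a, b in zip(s, s[1:]))
            if not pairs:
                break
            (a, b), _ = pairs.most_common(1)[0]  # most frequent adjacent pair
            merges.append(a + b)
            for s in seqs:                       # apply the merge everywhere
                i = 0
                while i < len(s) - 1:
                    if s[i] == a and s[i + 1] == b:
                        s[i:i + 2] = [a + b]
                    else:
                        i += 1
        return merges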
Sub-word Based Embeddings
Sub-word Based Embedding Models
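fastText is a well-known model in this category: a word's vector is built from the vectors of its character n-grams, so even unseen words get a representation. A schematic sketch (ngram_vecs is a hypothetical lookup table, assumed to hold numpy arrays):

    def subword_embedding(word, ngram_vecs, n=3):
        padded = "<" + word + ">"                # boundary markers, fastText-style
        grams = [padded[i:i + n] for i in range(len(padded) - n + 1)]
        vecs = [ngram_vecs[g] for g in grams if g in ngram_vecs]
        # Average the known n-gram vectors into one word vector
        return sum(vecs) / len(vecs) if vecs else None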
Embeddings for Cross-lingual Learning: Soft Decoupled Encoding
Labeled/Unlabeled Data Problem: We have very little labeled data for most analysis tasks in most languages
Joint Multi-task Learning
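The common pattern here is a shared encoder with task-specific heads, so scarce labels for one task can benefit from the others. A schematic sketch (the encoder and dimensions are assumptions of mine, not specified in the lecture):

    import torch.nn as nn

    class MultiTaskModel(nn.Module):
        def __init__(self, encoder, hidden, num_tags, num_classes):
            super().__init__()
            self.encoder = encoder                          # shared across tasks
            self.tag_head = nn.Linear(hidden, num_tags)     # sequence labeling head
            self.cls_head = nn.Linear(hidden, num_classes)  # classification head

        def forward(self, token_ids, task):
            h = self.encoder(token_ids)          # assumed (batch, seq_len, hidden)
            if task == "tag":
                return self.tag_head(h)          # per-token label scores
            return self.cls_head(h.mean(dim=1))  # pooled sentence-level scores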
Pre-training
Masked Language Modeling
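A simplified sketch of the masking step (BERT-style training additionally replaces some selected tokens with random or unchanged tokens; this version uses only [MASK]):

    import random

    def mask_tokens(tokens, mask_token="[MASK]", p=0.15):
        inputs, targets = [], []
        for t in tokens:
            if random.random() < p:
                inputs.append(mask_token)   # hide the token from the model
                targets.append(t)           # ...and train it to predict the original
            else:
                inputs.append(t)
                targets.append(None)        # no loss at unmasked positions
        return inputs, targets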
Thinking about Multi-tasking and Pre-trained Representations
Other Monolingual BERTs
XTREME: Comparing Multilingual Representations
Why Call it "Structured" Prediction?
Why Model Interactions in Output?
Local Normalization vs. Global Normalization
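In the standard formulations (not transcribed from the slides): a locally normalized model applies a softmax over tags at each position, while a globally normalized model such as a CRF applies one softmax over entire label sequences:

    Local:  P(Y | X) = \prod_{t=1}^{T} P(y_t | X, y_{<t})
    Global: P(Y | X) = \frac{\exp S(X, Y)}{\sum_{Y'} \exp S(X, Y')}

Global normalization avoids the label bias problem, at the cost of summing over all label sequences, which is typically made tractable with dynamic programming.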
Potential Functions
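In a linear-chain CRF, the global score decomposes into local potentials (standard textbook form; the symbol names here are mine):

    S(X, Y) = \sum_{t=1}^{T} \psi_{\mathrm{emit}}(y_t, X, t) + \sum_{t=2}^{T} \psi_{\mathrm{trans}}(y_{t-1}, y_t)

The emission potential scores a tag against the input at position t, while the transition potential scores adjacent tag pairs, which is what lets the model capture interactions in the output.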
Discussion
Description:
Explore advanced methods for text classification and sequence labeling in this 50-minute video lecture from CMU's Multilingual Natural Language Processing course. Delve into subword models, unsupervised training, and structured prediction models. Learn about bi-directional RNNs, bag of n-grams, and solutions for unknown words. Discover subword segmentation algorithms and embedding models, including cross-lingual learning techniques. Examine strategies for handling limited labeled data, such as joint multi-task learning and pre-training with masked language modeling. Compare multilingual representations and understand the importance of structured prediction in NLP tasks. Gain insights into local vs. global normalization and potential functions in advanced text classification and labeling techniques.
CMU Multilingual NLP 2020 - Advanced Text Classification-Labeling