1. Introduction
2. What are language models
3. Modern NLP
4. Scaling
5. Sparse models
6. GShard
7. BASE Layers
8. Formal optimization
9. Algorithmic optimization
10. Experiments
11. Comparison
12. Benefits
13. DEMix Layers
14. Representations
15. Simple routing
16. Training time
17. Parallel training
18. Data curation
19. Unrealistic setting
20. Domain structure
21. Inference procedure
22. Perplexity numbers
23. Modularity
24. Remove experts
25. Summary
26. Generic language models
27. Hot dog example
28. Hot pan example
29. Common sense example
30. Large language models
31. The fundamental challenge
32. Surface form competition
33. Flip the reasoning
34. Key intuition
35. Noisy channel models
36. Finetuning
37. Scoring strings
38. Web crawls
39. Example output
40. Structured data
41. Efficiency
42. Questions
43. Density estimation
44. Better training objectives
45. Optimization
46. Probability
47. Induction
48. Multimodality
49. Outliers
50. Compute vs data
Description:
Explore the future of large language models in this seminar by Luke Zettlemoyer at MIT. Delve into the challenges and possibilities of scaling language models, including sparse mixture-of-experts (MoE) models with reduced cross-node communication costs. Learn about prompting techniques that control for surface form variation, improving performance without extensive task-specific fine-tuning. Discover new forms of supervision for language model training, such as learning from hypertext and multi-modal web page structures. Gain insights into the potential next generation of NLP models, covering topics like modern NLP scaling, algorithmic optimization, parallel training, domain structure, and inference procedures. Examine the benefits and challenges of modular approaches, perplexity numbers, and the fundamental challenges of generic language models. Investigate the role of noisy channel models, fine-tuning, and scoring strings in improving model performance. Consider the impact of web crawls, structured data, efficiency, and multimodality on the future of language models.

Large Language Models - Will They Keep Getting Bigger?

Massachusetts Institute of Technology