1. Introduction
2. What are language models
3. Modern NLP
4. Scaling
5. Sparse models
6. GShard
7. BASE Layers
8. Formal optimization
9. Algorithmic optimization
10. Experiments
11. Comparison
12. Benefits
13. DEMix Layers
14. Representations
15. Simple routing
16. Training time
17. Parallel training
18. Data curation
19. Unrealistic setting
20. Domain structure
21. Inference procedure
22. Perplexity numbers
23. Modularity
24. Remove experts
25. Summary
26. Generic language models
27. Hot dog example
28. Hot pan example
29. Common sense example
30. Large language models
31. The fundamental challenge
32. Surface form competition
33. Flip the reasoning
34. Key intuition
35. Noisy channel models
36. Finetuning
37. Scoring strings
38. Web crawls
39. Example output
40. Structured data
41. Efficiency
42. Questions
43. Density estimation
44. Better training objectives
45. Optimization
46. Probability
47. Induction
48. Multimodality
49. Outliers
50. Compute vs data
Description:
Explore the future of large language models in this seminar by Luke Zettlemoyer at MIT. Delve into the challenges and possibilities of scaling language models, including sparse mixture-of-experts (MoE) models with reduced cross-node communication costs. Learn about prompting techniques that control for surface form variation, improving performance without extensive task-specific fine-tuning. Discover new forms of supervision for language model training, such as learning from hypertext and multi-modal web page structures. Gain insights into the potential next generation of NLP models, covering topics like modern NLP scaling, algorithmic optimization, parallel training, domain structure, and inference procedures. Examine the benefits and challenges of modular approaches, perplexity numbers, and the fundamental challenges of generic language models. Investigate the role of noisy channel models, fine-tuning, and scoring strings in improving model performance. Consider the impact of web crawls, structured data, efficiency, and multimodality on the future of language models.

Large Language Models - Will They Keep Getting Bigger?

Massachusetts Institute of Technology