1. Intro
2. Problem 1: Exposure Bias
3. Problem 2: Disregard for Evaluation Metrics
4. Error
5. Problem: Argmax is Non-differentiable
6. Sampling for Risk
7. Adding Temperature
8. What is Reinforcement Learning?
9. Why Reinforcement Learning in NLP?
10. Supervised MLE
11. Self-Training
12. Policy Gradient/REINFORCE
13. Credit Assignment for Rewards
14. Problems w/ Reinforcement Learning
15. Adding a Baseline
16. Calculating Baselines
17. Increasing Batch Size
18. Warm-start
19. When to Use Reinforcement Learning?
20. Action-Value Function
21. Estimating Value Functions
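Items 6–7 of the outline cover sampling and temperature. As a minimal illustrative sketch (not code from the lecture), temperature scaling divides the logits before the softmax: low temperature sharpens the distribution toward the argmax, high temperature flattens it toward uniform.

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    """Sampling distribution over tokens with a temperature knob.

    temperature < 1 sharpens toward argmax; temperature > 1 flattens
    toward uniform. A hypothetical helper for illustration only.
    """
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()  # subtract max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()
```

For example, `softmax_with_temperature([1.0, 2.0, 3.0], 0.1)` puts nearly all mass on the last token, while a temperature of 10 yields a nearly uniform distribution.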
Description:
Explore minimum risk training and reinforcement learning in natural language processing through this comprehensive lecture from CMU's Neural Networks for NLP course. Delve into concepts such as error and risk minimization, policy gradient methods, REINFORCE algorithm, and value-based reinforcement learning. Learn about techniques for stabilizing reinforcement learning, including adding baselines and increasing batch sizes. Understand the applications and challenges of reinforcement learning in NLP tasks, and gain insights into when to use these approaches effectively. Discover methods for estimating value functions and addressing problems like exposure bias and disregard for evaluation metrics in neural language models.
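The description mentions policy gradient methods, the REINFORCE algorithm, and baselines for stabilization. A minimal sketch of these ideas on a toy bandit problem (a hypothetical setup, not the lecture's own code) is:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "policy": a softmax over 3 actions parameterized by logits theta.
# true_rewards is a hypothetical reward table; action 1 is best.
theta = np.zeros(3)
true_rewards = np.array([0.0, 1.0, 0.2])

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for step in range(2000):
    probs = softmax(theta)
    # Sample a batch of actions from the current policy.
    batch = rng.choice(3, size=16, p=probs)
    rewards = true_rewards[batch]
    # Batch-mean baseline: subtracting it reduces gradient variance
    # without biasing the expected update direction (much).
    baseline = rewards.mean()
    grad = np.zeros(3)
    for a, r in zip(batch, rewards):
        # REINFORCE: (reward - baseline) * grad of log pi(a | theta),
        # which for a softmax policy is (one_hot(a) - probs).
        grad += (r - baseline) * (np.eye(3)[a] - probs)
    theta += 0.05 * grad / len(batch)

# After training, the policy should concentrate on the best action.
print(softmax(theta).argmax())
```

The batch-mean baseline here corresponds to the "Adding a Baseline" and "Increasing Batch Size" stabilization techniques listed in the outline, applied in their simplest form.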

Neural Nets for NLP - Minimum Risk Training and Reinforcement Learning

Graham Neubig