Intro

What is Reinforcement Learning?

Why Reinforcement Learning in NLP?

Supervised Learning

Self Training

Policy Gradient/REINFORCE

Credit Assignment for Rewards

Problems w/ Reinforcement Learning

Adding a Baseline

Calculating Baselines

Increasing Batch Size

When to Use Reinforcement Learning?

Policy-based vs. Value-based

Action-Value Function: given a state s, we try to estimate the value of each action a

Estimating Value Functions

Exploration vs. Exploitation

RL in Dialog

RL for Information Retrieval

Description:

Explore reinforcement learning concepts and their applications in natural language processing in this lecture. Cover the fundamentals of reinforcement learning, how it relates to supervised learning and self-training, and policy gradient methods, including the REINFORCE algorithm. Examine credit assignment for rewards and the problems that make reinforcement learning unstable, along with techniques for stabilizing training such as adding a baseline and increasing the batch size. Compare policy-based and value-based methods, and see how an action-value function estimates the value of each action in a given state. Learn when reinforcement learning is worth applying in NLP, address the exploration vs. exploitation trade-off, and look at applications to dialogue systems and information retrieval.

Neural Nets for NLP 2019 - Reinforcement Learning

Graham Neubig

#Computer Science #Artificial Intelligence #Neural Networks #Natural Language Processing (NLP) #Machine Learning #Reinforcement Learning #Policy Gradient