On the Hardness of Reinforcement Learning With Value-Function Approximation

Explore the complexities of Reinforcement Learning with value-function approximation in this 54-minute lecture by Nan Jiang from the University of Illinois Urbana-Champaign. Delve into applications of RL, compare it with Supervised Learning, and understand the structure of Markov Decision Processes. Examine batch learning in MDPs, with examples from video-game playing, and analyze the assumptions placed on the data and on the MDP dynamics. Investigate algorithms for batch RL and see how they can fail when the value-function class is restricted. Discover why strong assumptions such as "completeness" matter and why realizability alone may be insufficient. Follow attempts to prove a key conjecture and grasp its significance for the field. This talk, part of the "Emerging Challenges in Deep Learning" series at the Simons Institute, offers valuable insight into the hardness of RL with value-function approximation.
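For a concrete picture of the batch setting the lecture examines, below is a minimal sketch of fitted Q-iteration, a standard batch-RL algorithm. The data format, feature map, and per-action ridge-regression function class are illustrative assumptions, not taken from the talk; the comment on the regression target marks where the "completeness" assumption (every Bellman backup of a class member stays in the class) enters, in contrast to mere realizability (only Q* is in the class).

```python
# A minimal sketch of fitted Q-iteration (FQI), a standard batch-RL algorithm
# for this setting. Hypothetical setup: integer actions, every action appears
# in the batch, featurize(s) returns a 1-D numpy array, and the restricted
# function class is per-action ridge regression.
import numpy as np
from sklearn.linear_model import Ridge

def fitted_q_iteration(batch, featurize, n_actions, gamma=0.99, n_iters=50):
    """batch: list of offline transitions (s, a, r, s_next, done)."""
    models = [Ridge(alpha=1.0) for _ in range(n_actions)]  # one regressor per action
    fitted = False
    for _ in range(n_iters):
        X = [[] for _ in range(n_actions)]
        y = [[] for _ in range(n_actions)]
        for s, a, r, s_next, done in batch:
            # Regression target: the Bellman backup r + gamma * max_a' Q(s', a').
            # "Completeness" asks that this backed-up function itself lies in the
            # function class; realizability (Q* in the class) does not imply it.
            if fitted and not done:
                phi = featurize(s_next).reshape(1, -1)
                q_next = max(m.predict(phi)[0] for m in models)
            else:
                q_next = 0.0
            X[a].append(featurize(s))
            y[a].append(r + gamma * q_next)
        for a in range(n_actions):
            models[a].fit(np.asarray(X[a]), np.asarray(y[a]))
        fitted = True
    return models
```

When the backed-up targets keep leaving the function class, the regression error can compound across iterations, which is one way things go wrong with restricted classes even when Q* itself is representable.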