1. Intro
2. Bird's-eye view of RL
3. Illustrative application: RL in personal health
4. General thrust
5. Direction: Exploiting structure in RL
6. Vignette: Q-learning with low-rank structure
7. Vignette: Model-free versus model-based methods
8. Estimate dynamics or value functions for LQR? (linear state space model with quadratic reward function)
9. Performance of LSTD versus model-based methods
10. Direction: Exploration/exploitation beyond bandits
11. Vignette: Q-learning with UCB
12. Vignette: UCB and Monte Carlo Tree Search
13. Direction: From worst-case to instance-optimality
14. Vignette: Instance-optimality of TD learning?
15. Instance-optimality in policy evaluation
16. Direction: RL in offline settings and causal inference
17. Some future directions exploiting methods from causal inference: instrumental variables, propensity scores, doubly robust methods, synthetic controls
Description:
Explore reinforcement learning in this 31-minute lecture by Martin Wainwright from UC Berkeley, presented at the Foundations of Data Science Institute Kickoff Workshop. Gain a bird's-eye view of RL and its application in personal health. Delve into exploiting structure in RL, including Q-learning with low-rank structure and the comparison of model-free versus model-based methods. Examine the performance of LSTD versus model-based methods in linear state space models with quadratic reward functions. Investigate exploration/exploitation beyond bandits, including Q-learning with UCB and Monte Carlo Tree Search. Consider the move from worst-case guarantees to instance-optimality, particularly in policy evaluation. Finally, explore RL in offline settings and its connections to causal inference, touching on instrumental variables, propensity scores, doubly robust methods, and synthetic controls.
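To make the "Q-learning with UCB" vignette concrete, here is a minimal sketch of tabular Q-learning with a count-based UCB exploration bonus. The toy environment, the bonus constant c, and the step-size schedule are illustrative assumptions for this sketch, not details taken from the lecture.

import numpy as np

def q_learning_ucb(env_step, n_states, n_actions,
                   n_steps=10_000, gamma=0.9, c=1.0):
    """Tabular Q-learning with a UCB-style exploration bonus.

    env_step(state, action) -> (next_state, reward) is an assumed
    user-supplied simulator; c and the step-size schedule below are
    illustrative choices.
    """
    Q = np.zeros((n_states, n_actions))      # action-value estimates
    counts = np.ones((n_states, n_actions))  # visit counts (init 1 to avoid /0)
    state = 0
    for t in range(1, n_steps + 1):
        # Act optimistically: estimated value plus a count-based UCB bonus.
        bonus = c * np.sqrt(np.log(t + 1) / counts[state])
        action = int(np.argmax(Q[state] + bonus))
        next_state, reward = env_step(state, action)
        counts[state, action] += 1
        # Per-pair polynomially decaying step size.
        alpha = counts[state, action] ** -0.75
        td_target = reward + gamma * Q[next_state].max()
        Q[state, action] += alpha * (td_target - Q[state, action])
        state = next_state
    return Q

# Example usage on a hypothetical two-state chain where action 1 leads to
# a rewarding self-loop at state 1:
def toy_step(s, a):
    if s == 0:
        return (1, 0.0) if a == 1 else (0, 0.1)
    return (1, 1.0) if a == 1 else (0, 0.0)

Q = q_learning_ucb(toy_step, n_states=2, n_actions=2)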

Reinforcement Learning

Simons Institute