Intro

Legacy of Reinforcement Learning to Benefit People

Techniques to Minimize & Understand Data Needed to Learn to Make Good Decisions

Challenge: Covariate Shift (Different Policies → Different Actions → Different State Distributions)

Quest: Batch Policy Optimization w/ Generalization Bounds

Recall: Importance Sampling for RL Batch Policy Evaluation

1st Proof of Convergence to a Local Optimum for Batch Policy Gradient

Experiment Settings

HIV treatment simulator

Aim: Strong Generalization Guarantees on Policy Performance; Alternative: Guarantee to Find the Best-in-Class Policy

Example: Linear Thresholding Policies

An Advantage Decomposition

Advantage Doubly Robust (ADR) Estimator

Quest for Batch Policy Optimization with Generalization Guarantees

Description:

A lecture on batch and counterfactual reinforcement learning presented by Emma Brunskill of Stanford University at the 2019 ADSI Summer Workshop on Algorithmic Foundations of Learning and Control. The talk begins with the legacy of reinforcement learning to benefit people, then turns to techniques for minimizing and understanding the data needed to learn to make good decisions. It addresses the challenge of covariate shift, where different policies take different actions and so induce different state distributions, and frames the goal as batch policy optimization with generalization bounds. It recalls importance sampling for batch policy evaluation in RL and presents the first proof of convergence to a local optimum for batch policy gradient, with experimental settings that include an HIV treatment simulator. The remainder covers strong generalization guarantees on policy performance and the alternative of guaranteeing to find the best-in-class policy, including linear thresholding policies, an advantage decomposition, and the Advantage Doubly Robust (ADR) estimator.

ADSI Summer Workshop: Algorithmic Foundations of Learning and Control - Emma Brunskill

Paul G. Allen School

#Computer Science #Machine Learning #Reinforcement Learning #Importance Sampling