Intro

Sequential Decision Making Under Uncertainty

Learning to Make Good Sequences of Decisions Under Uncertainty → 1980s Reinforcement Learning

Background: Markov Decision Process Value Function

Background: Reinforcement Learning

Counterfactual / Batch Off Policy Reinforcement Learning

Need for Generalization

Growing Interest in Causal Inference & ML

Batch / Counterfactual Policy Optimization: Pick Policy w/Best Estimated Expected Sum of Rewards

Quest: Batch Policy Optimization w/ Generalization Bounds

Challenge: Good Error Bound Analysis

Aim: Strong Generalization Guarantees on Policy Performance, Alternative: Guarantee Find Good in Class Policy

Off-Policy Policy Gradient with State Distribution Correction

Aim: Strong Generalization Guarantees on Policy Performance, Alternative: Guarantee Find Best in Class Policy

Example: Linear Thresholding Policies Starting HIV treatment as soon as

Use an Advantage Decomposition

Use a Doubly Robust Advantage Decomposition

Quest for Batch Policy Optimization with Generalization Guarantees

Techniques to Minimize & Understand Data Needed to Learn to Make Good Decisions

Description:

Explore sequential decision-making under uncertainty in this 41-minute lecture by Emma Brunskill from Stanford University. Delve into the emerging challenges of deep learning, focusing on counterfactual and batch reinforcement learning. Learn about Markov Decision Processes, value functions, and the growing interest in causal inference and machine learning. Examine batch policy optimization with generalization bounds, off-policy policy gradient techniques, and methods to minimize the data needed for effective decision-making. Gain insights into linear thresholding policies, advantage decomposition, and doubly robust advantage decomposition in the context of real-world applications like HIV treatment initiation.

Better Learning from the Past - Counterfactual - Batch RL

Simons Institute

#Computer Science #Machine Learning #Reinforcement Learning #Mathematics #Statistics & Probability #Causal Inference #Artificial Intelligence #Sequential Decision Making