Intro

Reinforcement Learning

Policy Evaluation (PE)

Main Results

Notation

Objective Function for PE

Outline

Challenge with MSPBE

MSPBE as Saddle-Point Problem

Primal-Dual Batch Gradient for L(θ, w)

Stochastic Gradient Descent for L(θ, w)

Stochastic Variance Reduced Gradient (SVRG)

SAGA

Extensions

Complexity: Summary

Preliminary Experiments

Experiments: Benchmarks

Random MDPs

Mountain Car

Previous Work

Conclusions

Description:

Explore stochastic variance reduction methods for policy evaluation in reinforcement learning through this 47-minute lecture by Lihong Li from Microsoft Research. Delve into the challenges of minimizing the mean squared projected Bellman error (MSPBE) and its reformulation as a saddle-point problem. Examine several gradient-based approaches to that problem, including primal-dual batch gradient, stochastic gradient descent, and the variance-reduced methods Stochastic Variance Reduced Gradient (SVRG) and SAGA. Gain insights into complexity analysis, preliminary experiments, and benchmarks using random MDPs and the Mountain Car problem. Compare these methods with previous work and understand their implications for interactive learning in reinforcement learning contexts.
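As a rough illustration of the saddle-point view mentioned above, the sketch below checks numerically that maximizing a concave quadratic in a dual variable w recovers the MSPBE. It is a minimal sketch under assumptions, not the lecture's own code: the data are synthetic, there is no regularization term, and the names A, b, C, mspbe, lagrangian and best_dual are hypothetical choices for a linear value-function setting with feature vectors.

```python
import numpy as np

# Synthetic data (an assumption for illustration, not from the lecture).
rng = np.random.default_rng(0)
n, d, gamma = 200, 5, 0.95           # samples, feature dimension, discount

phi = rng.normal(size=(n, d))        # features of visited states
phi_next = rng.normal(size=(n, d))   # features of successor states
r = rng.normal(size=n)               # observed rewards

# Empirical statistics built from the samples.
A = phi.T @ (phi - gamma * phi_next) / n
b = phi.T @ r / n
C = phi.T @ phi / n                  # positive definite here since n > d

def mspbe(theta):
    """0.5 * ||A theta - b||^2 measured in the C^{-1} norm."""
    resid = A @ theta - b
    return 0.5 * resid @ np.linalg.solve(C, resid)

def lagrangian(theta, w):
    """Convex in theta, concave in w (no regularizer in this sketch)."""
    return -w @ (A @ theta) - (0.5 * w @ C @ w - w @ b)

def best_dual(theta):
    """Closed-form maximizer of lagrangian(theta, .) over w."""
    return np.linalg.solve(C, b - A @ theta)

theta = rng.normal(size=d)
w_star = best_dual(theta)

# Gap between max_w L(theta, w) and MSPBE(theta); should be ~0.
gap = abs(lagrangian(theta, w_star) - mspbe(theta))
print("gap:", gap)
```

The point of the reformulation in this sketch is that the Lagrangian involves A, b and C only linearly, so its gradients are plain averages over samples, which is the finite-sum structure that methods such as SVRG and SAGA rely on; the MSPBE itself involves the inverse of C and does not decompose that way.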

Stochastic Variance Reduction Methods for Policy Evaluation

Simons Institute

#Computer Science #Machine Learning #Reinforcement Learning #Algorithms #Optimization Algorithms #Stochastic Gradient Descent