1. Intro
2. Reinforcement Learning
3. Policy Evaluation (PE)
4. Main Results
5. Notation
6. Objective Function for PE
7. Outline
8. Challenge with MSPBE
9. MSPBE as Saddle-Point Problem
10. Primal Dual Batch Gradient for L(θ, w)
11. Stochastic Gradient Descent for L(θ, w)
12. Stochastic Variance Reduced Gradient (SVRG)
13. SAGA
14. Extensions
15. Complexity: Summary
16. Preliminary Experiments
17. Experiments: Benchmarks
18. Random MDPs
19. Mountain Car
20. Previous Work
21. Conclusions
Description:
Explore stochastic variance reduction methods for policy evaluation in reinforcement learning through this 47-minute lecture by Lihong Li from Microsoft Research. Delve into the challenges of mean squared projected Bellman error (MSPBE) and its formulation as a saddle-point problem. Examine various gradient-based approaches, including primal dual batch gradient, stochastic gradient descent, and advanced techniques like Stochastic Variance Reduced Gradient (SVRG) and SAGA. Gain insights into complexity analysis, preliminary experiments, and benchmarks using random MDPs and the Mountain Car problem. Compare these methods with previous work and understand their implications for interactive learning in reinforcement learning contexts.
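
For reference, the MSPBE objective mentioned above is usually written, for linear value-function approximation in the on-policy case, as follows (the symbols θ, w, φ, A, b, C follow standard gradient-TD notation and are assumed here rather than quoted from the slides):

\mathrm{MSPBE}(\theta) = \tfrac{1}{2}\,\lVert A\theta - b \rVert_{C^{-1}}^{2},
\qquad A = \mathbb{E}\!\left[\phi_t(\phi_t - \gamma\phi_{t+1})^{\top}\right],
\quad b = \mathbb{E}[r_t\,\phi_t],
\quad C = \mathbb{E}[\phi_t\phi_t^{\top}],

and by convex conjugacy of the quadratic it admits the saddle-point form

\min_{\theta}\,\mathrm{MSPBE}(\theta)
= \min_{\theta}\,\max_{w}\;\Big( w^{\top}(b - A\theta) - \tfrac{1}{2}\,w^{\top} C\, w \Big)
=: \min_{\theta}\,\max_{w}\; L(\theta, w).

The NumPy sketch below illustrates how SVRG-style variance reduction can be applied to this saddle-point problem; the function name, step sizes, and epoch length are illustrative assumptions, not the lecture's exact algorithm.

```python
import numpy as np

def svrg_policy_evaluation(phi, phi_next, rewards, gamma=0.95,
                           sigma_theta=0.01, sigma_w=0.01,
                           epochs=20, inner_steps=None, seed=0):
    """Variance-reduced primal-dual updates for L(theta, w).

    phi      : (n, d) features of visited states
    phi_next : (n, d) features of successor states
    rewards  : (n,)   observed rewards (on-policy, no importance weights)
    """
    rng = np.random.default_rng(seed)
    n, d = phi.shape
    inner_steps = inner_steps or n
    theta, w = np.zeros(d), np.zeros(d)

    for _ in range(epochs):
        # Snapshot point and full batch gradients of L at the snapshot.
        theta_s, w_s = theta.copy(), w.copy()
        A = phi.T @ (phi - gamma * phi_next) / n   # E[phi (phi - gamma*phi')^T]
        b = phi.T @ rewards / n                    # E[r phi]
        C = phi.T @ phi / n                        # E[phi phi^T]
        g_theta_full = -A.T @ w_s                  # grad_theta L(theta_s, w_s)
        g_w_full = b - A @ theta_s - C @ w_s       # grad_w     L(theta_s, w_s)

        for _ in range(inner_steps):
            t = rng.integers(n)
            f, fp, r = phi[t], phi_next[t], rewards[t]
            A_t, b_t, C_t = np.outer(f, f - gamma * fp), r * f, np.outer(f, f)
            # Variance-reduced stochastic gradients: sample gradient at the
            # current point, minus sample gradient at the snapshot, plus the
            # full snapshot gradient.
            g_theta = -A_t.T @ w + A_t.T @ w_s + g_theta_full
            g_w = (b_t - A_t @ theta - C_t @ w) \
                  - (b_t - A_t @ theta_s - C_t @ w_s) + g_w_full
            theta -= sigma_theta * g_theta   # primal descent on theta
            w += sigma_w * g_w               # dual ascent on w
    return theta
```

SAGA replaces the periodic full-gradient snapshot with a running table of the most recently seen per-sample gradients, but the variance-reduced update has the same structure.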

Stochastic Variance Reduction Methods for Policy Evaluation

Simons Institute