Outline:
1. Intro
2. Tabular Markov decision process
3. Prior efforts: algorithms and sample complexity results
4. Minimax optimal sample complexity of tabular MDP
5. Adding some structure: state feature map
6. Representing value function using linear combination of features
7. Rethinking Bellman equation
8. Reducing Bellman equation using features
9. Sample complexity of RL with features
10. Off-Policy Policy Evaluation (OPE)
11. OPE with function approximation
12. Equivalence to plug-in estimation
13. Minimax-optimal batch policy evaluation
14. Lower Bound Analysis
15. Episodic Reinforcement Learning
16. Feature space embedding of transition kernel
17. Regret Analysis
18. Exploration with Value-Targeted Regression (VTR)
Description:
Explore the statistical complexities of reinforcement learning in this 47-minute lecture by Mengdi Wang from Princeton University. Delve into key theoretical questions surrounding RL, including sample complexity, regret analysis, and off-policy evaluation. Examine recent findings on minimax-optimal sample complexities for solving Markov Decision Processes, optimal off-policy evaluation through regression, and regret bounds for online RL with nonparametric model estimation. Gain insights into tabular MDPs, state feature mapping, Bellman equation reduction, and episodic reinforcement learning. Understand the importance of feature space embedding of transition kernels and exploration techniques like Value-Targeted Regression. This talk, part of the Intersections between Control, Learning and Optimization 2020 series at the Institute for Pure & Applied Mathematics, offers a comprehensive overview of recent advancements in the theoretical foundations of reinforcement learning.
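As a brief illustration of the "Reducing Bellman equation using features" topic in the outline (the notation below is generic and not taken from the slides): for a fixed policy pi with discount factor gamma, the value function satisfies the Bellman equation, and if it is well approximated by a linear combination of d state features phi(s) with coefficients theta, the fixed-point condition can be written as

\[
V^{\pi}(s) \;=\; \mathbb{E}\big[\, r(s,a) + \gamma\, V^{\pi}(s') \;\big|\; s,\ a \sim \pi(\cdot \mid s) \,\big],
\qquad
V^{\pi}(s) \;\approx\; \sum_{j=1}^{d} \theta_j\, \phi_j(s) \;=\; \phi(s)^{\top}\theta .
\]

Substituting the linear form into the Bellman equation replaces a fixed-point problem over all states with one over the d-dimensional parameter theta, which is the mechanism behind sample-complexity bounds that scale with the feature dimension d rather than with the size of the state space.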

On the Statistical Complexity of Reinforcement Learning

Institute for Pure & Applied Mathematics (IPAM)