1. Intro
2. Markov decision process
3. What does a sample mean?
4. Complexity and Regret for Tabular MDPs
5. Rethinking the Bellman equation
6. State Feature Map
7. Representing the value function using a linear combination of features
8. Reducing the Bellman equation using features
9. Sample complexity of RL with features
10. Learning to Control On-The-Fly
11. Episodic Reinforcement Learning
12. Hilbert space embedding of the transition kernel
13. The MatrixRL Algorithm
14. Regret Analysis
15. From features to kernels
16. MatrixRL has an equivalent kernelization
17. Pros and cons of using features for RL
18. What could be good state features?
19. Finding Metastable State Clusters
20. Example: stochastic diffusion process
21. Unsupervised state aggregation learning
22. Soft state aggregation for NYC taxi data
23. Example: State Trajectories of Demon Attack
Description:
Explore reinforcement learning in feature space through this 45-minute lecture by Mengdi Wang from Princeton University. Delve into Markov decision processes, sample complexity, and regret analysis for tabular MDPs. Examine the Bellman equation, state feature maps, and value function representation using linear combinations of features. Investigate episodic reinforcement learning, Hilbert space embedding of transition kernels, and the MatrixRL algorithm. Consider the pros and cons of using features for RL and what makes a good state feature. Learn about metastable state clusters and unsupervised state aggregation, with applications to stochastic diffusion processes and NYC taxi data. Gain insights into emerging challenges in deep learning through this Simons Institute talk.
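For orientation, the feature-based reduction referenced in the outline can be sketched as follows (a generic formulation in our own notation, not a transcript of the talk). Assuming a feature map \(\phi\) over states (or state-action pairs) and, for the transition model, a second feature map \(\psi\) over next states, one posits

\[
V_\theta(s) \;=\; \phi(s)^\top \theta,
\qquad
P(s' \mid s, a) \;\approx\; \phi(s, a)^\top M \, \psi(s'),
\]

so that the Bellman optimality equation

\[
V^*(s) \;=\; \max_{a} \Big[ r(s, a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^*(s') \Big]
\]

becomes a fixed-point condition on the low-dimensional parameter \(\theta\) (or the matrix \(M\)) rather than on one value per state, which is why the sample-complexity and regret bounds discussed in the lecture can depend on the feature dimension instead of the number of states.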

Reinforcement Learning in Feature Space: Complexity and Regret

Simons Institute