1. Intro
2. Markov decision process
3. What does a sample mean?
4. Complexity and Regret for Tabular MDPs
5. Rethinking the Bellman equation
6. State Feature Map
7. Representing the value function using a linear combination of features
8. Reducing the Bellman equation using features
9. Sample complexity of RL with features
10. Learning to Control On-The-Fly
11. Episodic Reinforcement Learning
12. Hilbert space embedding of the transition kernel
13. The MatrixRL Algorithm
14. Regret Analysis
15. From features to kernels
16. MatrixRL has an equivalent kernelization
17. Pros and cons of using features for RL
18. What could be good state features?
19. Finding Metastable State Clusters
20. Example: stochastic diffusion process
21. Unsupervised state aggregation learning
22. Soft state aggregation for NYC taxi data
23. Example: State Trajectories of Demon Attack
Description:
Explore reinforcement learning in feature space through this 45-minute lecture by Mengdi Wang from Princeton University. Delve into Markov decision processes, sample complexity, and regret analysis for tabular MDPs. Examine the Bellman equation, state feature maps, and value function representation using linear combinations of features. Investigate episodic reinforcement learning, Hilbert space embedding of transition kernels, and the MatrixRL algorithm. Consider the pros and cons of using features for RL and what makes a good state feature. Learn about metastable state clusters and unsupervised state aggregation, with applications to stochastic diffusion processes and NYC taxi data. Gain insights into emerging challenges in deep learning through this Simons Institute talk.
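For orientation, the feature-based reduction referenced in the outline can be sketched as follows (a generic formulation in our own notation, not a transcript of the talk). Assuming a feature map \(\phi\) over states (or state-action pairs) and, for the transition model, a second feature map \(\psi\) over next states, one posits

\[
V_\theta(s) \;=\; \phi(s)^\top \theta,
\qquad
P(s' \mid s, a) \;\approx\; \phi(s, a)^\top M \, \psi(s'),
\]

so that the Bellman optimality equation

\[
V^*(s) \;=\; \max_{a} \Big[ r(s, a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^*(s') \Big]
\]

becomes a fixed-point condition on the low-dimensional parameter \(\theta\) (or the matrix \(M\)) rather than on one value per state, which is why the sample-complexity and regret bounds discussed in the lecture can depend on the feature dimension instead of the number of states.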

Reinforcement Learning in Feature Space: Complexity and Regret

Simons Institute