Intro

Model-Based Reinforcement Learning

Episodic Reinforcement Learning

Upper Confidence Model-Based RL (UCRL)

The class of deterministic continuous systems. Consider a deterministic system

A Simple Metric-Based RL Algorithm

Doubling Dimension d

Feature space embedding of transition model

The MatrixRL Algorithm

From Feature to Kernel Embedding of Transition Model

A motivating example: MuZero

Assumption of Value-Targeted Regression

Value-Targeted Regression (VTR) for Confidence Set Construction

Full Algorithm of UCRL-VTR

Regret analysis of UCRL-VTR

A Special Case

Description:

Explore model-based reinforcement learning with value-targeted regression (VTR) in this 36-minute lecture by Mengdi Wang from Princeton University. Begin with episodic reinforcement learning, upper confidence model-based RL (UCRL), and the class of deterministic continuous systems. Examine feature space embedding of transition models, the MatrixRL algorithm, and the step from feature to kernel embedding. See how MuZero motivates value-targeted regression and what assumption the method relies on. Learn how VTR is used to construct confidence sets, follow the full UCRL-VTR algorithm, and work through its regret analysis, including a special case.

Model-Based Reinforcement Learning with Value-Targeted Regression

Simons Institute

#Computer Science #Machine Learning #Reinforcement Learning #Model Based Reinforcement Learning
