Intro

Sequential Decision Making

Reinforcement Learning

Sample Efficiency

Value-based Algorithms

Exploration

Multi-armed Bandits

Upper Confidence Bound (UCB)

Q-learning with UCB

Beyond Tabular Setting

Linear Function Approximation

A Natural Algorithm

Linear MDP

Related Work

Description:

This 28-minute lecture from the Workshop on Theory of Deep Learning covers provably efficient reinforcement learning with linear function approximation. Chi Jin, a Member of the School of Mathematics at the Institute for Advanced Study, begins with sequential decision making, sample efficiency, and value-based algorithms. The talk then turns to exploration, covering multi-armed bandits, Upper Confidence Bound (UCB), and Q-learning with UCB, before moving beyond the tabular setting to linear function approximation, a natural algorithm for that setting, linear MDPs, and related work.

Provably Efficient Reinforcement Learning with Linear Function Approximation - Chi Jin

Institute for Advanced Study

#Computer Science #Machine Learning #Reinforcement Learning #Q-learning #Multi-Armed Bandits #Artificial Intelligence #Sequential Decision Making