Reinforcement Learning via an Optimization Lens
Simons Institute

Outline:
1. Intro
2. Reinforcement Learning: Learning to Make Decisions
3. Online vs. Offline (Batch) RL: A Basic View
4. Outline
5. Markov Decision Process (MDP)
6. MDP Example: Deterministic Shortest Path
7. More General Case: Bellman Equation
8. Bellman Operator
9. When Bellman Meets Gauss: Approximate DP
10. Divergence Example of Tsitsiklis & Van Roy (1996) (see the sketch after this outline)
11. Does It Matter in Practice?
12. A Long-standing Open Problem
13. Linear Programming Reformulation
14. Why Is Solving for the Fixed Point Directly Hard?
15. Addressing Difficulty #2: Legendre-Fenchel Transformation
16. Reformulation of the Bellman Equation
17. Primal-Dual Problems Are Hard to Solve
18. A New Loss for Solving the Bellman Equation
19. Eigenfunction Interpretation
20. Puddle World with Neural Networks
21. Conclusions
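
As a companion to item 10, here is a minimal Python sketch (not taken from the lecture itself) of the standard two-state divergence example attributed to Tsitsiklis & Van Roy (1996); the variable names and constants below are illustrative assumptions, chosen to reproduce the example's known closed-form behavior:

import numpy as np

# Two-state chain: phi(s1) = 1, phi(s2) = 2; deterministic transitions
# s1 -> s2 and s2 -> s2; all rewards are zero, so the true value function
# is identically zero and exact value iteration converges trivially.
gamma = 0.9                        # discount; divergence needs gamma > 5/6
phi = np.array([1.0, 2.0])         # feature of each state
next_phi = np.array([2.0, 2.0])    # feature of each state's successor

theta = 1.0                        # linear weight: V_hat(s) = theta * phi(s)
for k in range(20):
    # Bellman backup targets r + gamma * V_hat(s'), with r = 0 everywhere.
    targets = gamma * theta * next_phi
    # Least-squares projection onto span{phi} with uniform state weights:
    # theta <- argmin_t sum_s (t * phi(s) - targets(s))^2
    theta = (phi @ targets) / (phi @ phi)
    print(f"iteration {k:2d}: theta = {theta:10.4f}")

The projection step gives the closed-form recursion theta_{k+1} = (6/5) * gamma * theta_k, so the weight grows without bound whenever gamma > 5/6, even though every true value is zero; lowering gamma below 5/6 makes the same loop converge.
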
Description:
Explore reinforcement learning through an optimization lens in this 47-minute lecture by Lihong Li of Google Brain. Delve into the fundamentals of reinforcement learning, including Markov Decision Processes, Bellman equations, and the challenges of online versus offline (batch) learning. Examine the intersection of Bellman and Gauss in approximate dynamic programming, and investigate a long-standing open problem in the field. Discover how a linear programming reformulation and the Legendre-Fenchel transformation address the difficulties of solving fixed-point problems directly. Learn about a new loss function for solving the Bellman equation and its eigenfunction interpretation. Conclude with a practical demonstration using neural networks on the Puddle World benchmark.
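
For reference, the Bellman optimality equation and the classical linear-programming reformulation that the description alludes to can be stated in standard textbook notation (V is the value function, r the reward, gamma the discount, P the transition kernel, and mu any positive state weighting; none of this notation is taken from the lecture page itself):

% V* is the unique fixed point of the Bellman optimality operator T.
\[
  (T V)(s) = \max_{a}\Big[\, r(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V(s') \,\Big],
  \qquad V^{*} = T V^{*}.
\]
% LP reformulation: the max becomes a family of linear constraints, and V*
% is the pointwise-smallest feasible V, recovered by minimizing any
% positively weighted sum of values.
\[
  \min_{V}\; \sum_{s} \mu(s)\, V(s)
  \quad \text{s.t.} \quad
  V(s) \,\ge\, r(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V(s')
  \quad \forall\, s, a.
\]

Monotonicity of T ensures that any feasible V dominates V*, so the LP's optimum is exactly the fixed point; this is presumably the linear programming reformulation that outline item 13 refers to.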
