1. Intro
2. Acknowledgements
3. Reinforcement Learning (RL)
4. Challenges of Real-World RL
5. Goals and Preferences
6. Linear Temporal Logic (LTL): A compelling logic to express temporal properties of traces
7. Challenges to RL
8. Toy Problem Disclaimer
9. Running Example
10. Decoupling Transition and Reward Functions
11. The Rest of the Talk
12. Define a Reward Function using a Reward Machine
13. Reward Function Vocabulary
14. Simple Reward Machine
15. Reward Machines in Action
16. Other Reward Machines
17. Q-Learning Baseline
18. Option-Based Hierarchical RL (HRL)
19. HRL with RM-Based Pruning (HRL-RM)
20. HRL Methods Can Find Suboptimal Policies
21. Q-Learning for Reward Machines (QRM)
22. QRM In Action
23. Recall: Methods for Exploiting RM Structure
24. 5. QRM + Reward Shaping (QRM + RS)
25. Test Domains
26. Test in Discrete Domains
27. Office World Experiments
28. Minecraft World Experiments
29. Function Approximation with QRM
30. Water World Experiments
31. Creating Reward Machines
32. Reward Specification: one size does not fit all
33. 1. Construct Reward Machine from Formal Languages
34. Generate RM using a Symbolic Planner
35. Learn RMs for Partially-Observable RL
Description:
Explore formal languages and automata for reward function specification and efficient reinforcement learning in this lecture by Sheila McIlraith of the University of Toronto. The talk opens with the challenges of real-world reinforcement learning, in particular how to express goals and preferences, and presents Linear Temporal Logic (LTL) as a compelling logic for expressing temporal properties of traces. It then introduces reward machines as a way to define reward functions and compares several learning methods: a Q-Learning baseline, Option-Based Hierarchical RL (HRL), HRL with RM-based pruning (HRL-RM), Q-Learning for Reward Machines (QRM), and QRM with reward shaping. Experimental results cover the discrete Office World and Minecraft World domains and Water World, where QRM is used with function approximation. The final part covers ways to create reward machines: constructing them from formal languages, generating them with a symbolic planner, and learning them for partially observable RL.

Formal Languages and Automata for Reward Function Specification and Efficient Reinforcement Learning

Simons Institute