Intro

Acknowledgements

Reinforcement Learning (RL)

Challenges of Real-World RL

Goals and Preferences

Linear Temporal Logic (LTL): A compelling logic to express temporal properties of traces

Challenges to RL

Toy Problem Disclaimer

Running Example

Decoupling Transition and Reward Functions

The Rest of the Talk

Define a Reward Function using a Reward Machine

Reward Function Vocabulary

Simple Reward Machine

Reward Machines in Action

Other Reward Machines

Q-Learning Baseline

Option-Based Hierarchical RL (HRL)

HRL with RM-Based Pruning (HRL-RM)

HRL Methods Can Find Suboptimal Policies

Q-Learning for Reward Machines (QRM)

QRM In Action

Recall: Methods for Exploiting RM Structure

5. QRM + Reward Shaping (QRM + RS)

Test Domains

Test in Discrete Domains

Office World Experiments

Minecraft World Experiments

Function Approximation with QRM

Water World Experiments

Creating Reward Machines

Reward Specification: one size does not fit all

1. Construct Reward Machine from Formal Languages

Generate RM using a Symbolic Planner

Learn RMs for Partially-Observable RL

Description:

Explore formal languages and automata for reward function specification and efficient reinforcement learning in this lecture by Sheila McIlraith from the University of Toronto. Delve into the challenges of real-world reinforcement learning, focusing on how goals and preferences can be expressed. Examine Linear Temporal Logic (LTL) as a compelling logic for expressing temporal properties of traces. Discover the concept of reward machines and how they are used to define reward functions. Compare reinforcement learning methods that exploit reward machine structure, including a Q-Learning baseline, Option-Based Hierarchical RL (with and without RM-based pruning), and Q-Learning for Reward Machines (QRM), with and without reward shaping. Analyze experimental results from the discrete Office World and Minecraft World domains, and from Water World, where QRM is used with function approximation. Investigate techniques for creating reward machines, including construction from formal languages, generation using a symbolic planner, and learning reward machines for partially observable reinforcement learning.
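
For readers unfamiliar with the term, a reward machine can be thought of as a finite-state machine whose transitions are triggered by high-level events and emit rewards. The Python sketch below is only an illustration of that idea, not code from the lecture: the class name, the state names (u0, u1, done), the event names (coffee, office), and the "deliver coffee to the office" task are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Illustrative finite-state machine that maps high-level events to rewards."""
    initial_state: str
    terminal_states: set
    # Maps (state, event) -> (next_state, reward).
    transitions: dict = field(default_factory=dict)

    def step(self, state, event):
        # Assumption: an event with no listed transition leaves the
        # state unchanged and yields zero reward.
        return self.transitions.get((state, event), (state, 0.0))


def total_reward(rm, events):
    """Run the machine over a sequence of events and sum the rewards."""
    state, total = rm.initial_state, 0.0
    for event in events:
        if state in rm.terminal_states:
            break
        state, reward = rm.step(state, event)
        total += reward
    return state, total


# Hypothetical task: pick up coffee first, then reach the office.
coffee_rm = RewardMachine(
    initial_state="u0",
    terminal_states={"done"},
    transitions={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("done", 1.0),
    },
)
```

In this sketch, visiting the office before picking up coffee earns nothing, because the reward depends on the machine's state rather than on the current event alone. That is the kind of temporally extended reward a plain per-step reward function cannot express without extra memory.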

Formal Languages and Automata for Reward Function Specification and Efficient Reinforcement Learning

Simons Institute

#Computer Science #Automata Theory #Artificial Intelligence #Machine Learning #Reinforcement Learning #Theoretical Computer Science #Formal Languages #Q-learning
