Scaling up: Efficient Optimistic Exploration in Deep Model-based Reinforcement Learning
Optimism in Model-based Deep RL
Deep Model-based RL with Confidence: H-UCRL [Curi, Berkenkamp, K, NeurIPS 2020]
Illustration on Inverted Pendulum
Deep RL: MuJoCo Half-Cheetah
Action penalty effect
What about safety?
Safety-Gym Benchmark Suite
Which priors to choose? → PAC-Bayesian Meta Learning [Rothfuss, Fortuin, Josifoski, K, ICML 2021]
Experiments - Predictive accuracy (Regression)
Meta-Learned Priors for Bayesian Optimization
Meta-Learned Priors for Sequential Decision Making
Safe and efficient exploration in real-world RL
Acknowledgments
Description:
Explore safe and efficient reinforcement learning techniques in this one-hour seminar from the Machine Learning Advances and Applications series. Delve into the challenges of applying RL beyond simulated environments, focusing on safety constraints and tuning real-world systems like the Swiss Free Electron Laser. Learn about safe Bayesian optimization, Gaussian process inference, and confidence intervals for certifying safety. Discover methods for safe learning in dynamical systems, including planning with confidence bounds and forward-propagating uncertain, nonlinear dynamics. Examine scaling up efficient optimistic exploration in deep model-based RL, with illustrations on the inverted pendulum and MuJoCo Half-Cheetah environments. Investigate PAC-Bayesian Meta Learning for choosing priors and its applications in Bayesian optimization and sequential decision making. Gain insights into safe and efficient exploration techniques applicable to real-world reinforcement learning scenarios.
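The optimistic-exploration idea behind H-UCRL mentioned above can be summarized as letting the planner act optimistically within the dynamics model's confidence intervals. The sketch below is a minimal, illustrative toy (not code from the talk or the paper): the toy dynamics model mu/sigma, the reward, the horizon, and the random-shooting planner are all assumptions chosen only to show how a hallucinated control eta in [-1, 1] selects the most favorable next state s' = mu(s, a) + beta * sigma(s, a) * eta.

# Minimal sketch (assumed toy setup, not the speaker's implementation):
# the planner searches over real actions and hallucinated controls eta,
# treating any state inside the model's confidence interval as reachable.
import numpy as np

rng = np.random.default_rng(0)
BETA, HORIZON, N_CANDIDATES = 2.0, 5, 500

def mu(s, a):
    # Mean prediction of a (pretend) learned 1-D dynamics model.
    return s + 0.1 * a

def sigma(s, a):
    # Predictive standard deviation (epistemic uncertainty) of the model.
    return 0.05 * (1.0 + np.abs(s))

def reward(s, a):
    # Toy reward: drive the state toward 1 with a small action penalty.
    return -(s - 1.0) ** 2 - 0.01 * a ** 2

def optimistic_plan(s0):
    # Random-shooting search over action and hallucination sequences;
    # returns the first action of the most optimistic candidate plan.
    actions = rng.uniform(-1, 1, size=(N_CANDIDATES, HORIZON))
    etas = rng.uniform(-1, 1, size=(N_CANDIDATES, HORIZON))
    best_ret, best_a0 = -np.inf, 0.0
    for a_seq, eta_seq in zip(actions, etas):
        s, ret = s0, 0.0
        for a, eta in zip(a_seq, eta_seq):
            ret += reward(s, a)
            # Optimism: eta picks any next state within the confidence interval.
            s = mu(s, a) + BETA * sigma(s, a) * eta
        if ret > best_ret:
            best_ret, best_a0 = ret, a_seq[0]
    return best_a0

print("first optimistic action from s0=0:", optimistic_plan(0.0))

In the full method the random-shooting search is replaced by policy optimization with a deep dynamics model, but the role of the hallucinated control is the same: it turns exploration into an optimization problem over the model's confidence set.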
Safe and Efficient Exploration in Reinforcement Learning