1. Introduction
2. Workshop Overview
3. Presentation
4. Distributional Domain Shift
5. Answer Identification
6. Improved Sketch
7. Summary
8. Discussion
Description:
Explore a deep reinforcement learning presentation on Model-Based Offline Policy Optimization (MOPO), delivered by Tengyu Ma of Stanford University at the Simons Institute. The talk covers distributional domain shift, answer identification, and an improved sketch of the approach as the speaker presents methods for offline reinforcement learning. Gain insight into the challenges of, and solutions for, learning effective policies from pre-collected datasets without direct interaction with the environment.
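
The central idea in MOPO is to train a policy inside a learned dynamics model whose reward is penalized by an estimate of the model's own uncertainty, so the policy is discouraged from exploiting regions the offline dataset does not cover. Below is a minimal sketch of that penalized reward under stated assumptions: the ensemble objects and their `predict` method are a hypothetical API (each returning a next-state mean, a next-state standard deviation, and a reward), and `lam` stands for the penalty coefficient λ; this is an illustration of the idea, not the authors' implementation.

```python
import numpy as np

def penalized_reward(models, state, action, lam=1.0):
    """Sketch of MOPO's uncertainty-penalized reward:
    r_tilde(s, a) = r_hat(s, a) - lam * u(s, a),
    where u(s, a) is an ensemble-based uncertainty estimate.

    `models` is a hypothetical ensemble of learned dynamics models;
    m.predict(state, action) -> (next_state_mean, next_state_std, reward).
    """
    preds = [m.predict(state, action) for m in models]  # hypothetical API
    rewards = np.array([r for (_, _, r) in preds])
    std_norms = np.array([np.linalg.norm(std) for (_, std, _) in preds])

    r_hat = rewards.mean()   # mean predicted reward across the ensemble
    u = std_norms.max()      # pessimism: largest predicted uncertainty
    return r_hat - lam * u
```

Larger `lam` makes the surrogate MDP more pessimistic, trading off return inside the model against safety with respect to distributional shift.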

MOPO - Model-Based Offline Policy Optimization

Simons Institute