1. Welcome to the AI Seminar Series
2. Reinforcement Learning (RL)
3. RL basics
4. Deep Q-learning (DQN)
5. Why use a target network?
6. Why reduce estimation variance?
7. Ensemble RL methods
8. Ensemble RL for variance reduction
9. MeanQ design choices
10. Combining with existing techniques
11. Experiment results (100K interaction steps)
12. Obviating the target network
13. Comparing model size and update rate
14. MeanQ: variance reduction
15. Loss of ensemble diversity
16. Linear function approximation
17. Diversity through independent sampling
18. Ongoing investigation
19. Takeaways
20. Fictitious Play
21. What to do in large dynamical environments
22. PSRO convergence properties
23. Extensive-Form Double Oracle (XDO)
24. XDO: results
25. XDO convergence properties
Description:
Explore population-based methods for single- and multi-agent reinforcement learning in this 51-minute lecture presented by Roy Fox from UCI at the USC Information Sciences Institute. Delve into ensemble methods for reinforcement learning, focusing on the MeanQ algorithm and its ability to reduce estimation variance without a stabilizing target network. Examine the curious theoretical equivalence of MeanQ to a non-ensemble method, despite its superior empirical performance. Investigate double-oracle methods in adversarial environments, including the XDO algorithm, which exploits sequential game structure to reduce the worst-case population size. Learn about the speaker's research interests in reinforcement learning, algorithmic game theory, information theory, and robotics. Cover topics including RL basics, Deep Q-learning, ensemble RL methods, variance reduction techniques, and extensive-form double-oracle algorithms. Gain insights into the latest advancements in single- and multi-agent reinforcement learning through this comprehensive seminar.
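
To make the central idea concrete: MeanQ bootstraps from the mean of an ensemble of Q-estimates instead of a frozen target network. The following is a minimal sketch of that idea only, not the speaker's code; the linear function approximation setting, sizes, and names are assumptions made for illustration.

# Minimal sketch (not the speaker's implementation): TD learning where the
# bootstrap target comes from the mean of K Q-estimates rather than a
# separate frozen target network. All sizes and names are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
K, n_features, n_actions = 5, 8, 4                  # hypothetical sizes
W = [rng.normal(scale=0.01, size=(n_features, n_actions)) for _ in range(K)]

def q_values(w, phi):
    return phi @ w                                  # linear Q(s, .) = phi(s)^T w

def mean_q(phi):
    # Averaging K estimates reduces the variance of the target value,
    # roughly by a factor of K if the estimates are independent.
    return np.mean([q_values(w, phi) for w in W], axis=0)

def td_update(k, phi, a, r, phi_next, done, gamma=0.99, lr=0.1):
    # Semi-gradient TD step on ensemble member k; the ensemble mean
    # supplies the bootstrap value in place of a target network.
    target = r + (0.0 if done else gamma * np.max(mean_q(phi_next)))
    td_error = target - q_values(W[k], phi)[a]
    W[k][:, a] += lr * td_error * phi

Because every member is trained against the same low-variance mean target, the stabilizing role usually played by a target network can be obviated, which is the effect the lecture's chapters on variance reduction and "Obviating the target network" discuss.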

Population-Based Methods for Single- and Multi-Agent Reinforcement Learning - Lecture

USC Information Sciences Institute