1. Welcome to the AI Seminar Series
2. Reinforcement Learning (RL)
3. RL basics
4. Deep Q-learning (DQN)
5. Why use a target network?
6. Why reduce estimation variance?
7. Ensemble RL methods
8. Ensemble RL for variance reduction
9. MeanQ design choices
10. Combining with existing techniques
11. Experiment results (100K interaction steps)
12. Obviating the target network
13. Comparing model size and update rate
14. MeanQ: variance reduction
15. Loss of ensemble diversity
16. Linear function approximation
17. Diversity through independent sampling
18. Ongoing investigation
19. Takeaways
20. Fictitious Play
21. What to do in large dynamical environments
22. PSRO convergence properties
23. Extensive-Form Double Oracle (XDO)
24. XDO: results
25. XDO convergence properties
Description:
Explore population-based methods for single- and multi-agent reinforcement learning in this 51-minute lecture presented by Roy Fox from UCI at the USC Information Sciences Institute. Delve into ensemble methods for reinforcement learning, focusing on the MeanQ algorithm and its ability to reduce estimation variance without a stabilizing target network. Examine the curious theoretical equivalence of MeanQ to a non-ensemble method, despite its superior empirical performance. Investigate double-oracle methods in adversarial environments, including the XDO algorithm, which exploits sequential game structure to reduce the worst-case population size. Learn about the speaker's research interests in reinforcement learning, algorithmic game theory, information theory, and robotics. Cover topics including RL basics, Deep Q-learning, ensemble RL methods, variance reduction techniques, and extensive-form double-oracle algorithms. Gain insights into the latest advancements in single- and multi-agent reinforcement learning through this comprehensive seminar.
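
To make the central idea concrete: MeanQ bootstraps from the mean of an ensemble of Q-estimates instead of a frozen target network. The following is a minimal sketch of that idea only, not the speaker's code; the linear function approximation setting, sizes, and names are assumptions made for illustration.

# Minimal sketch (not the speaker's implementation): TD learning where the
# bootstrap target comes from the mean of K Q-estimates rather than a
# separate frozen target network. All sizes and names are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
K, n_features, n_actions = 5, 8, 4                  # hypothetical sizes
W = [rng.normal(scale=0.01, size=(n_features, n_actions)) for _ in range(K)]

def q_values(w, phi):
    return phi @ w                                  # linear Q(s, .) = phi(s)^T w

def mean_q(phi):
    # Averaging K estimates reduces the variance of the target value,
    # roughly by a factor of K if the estimates are independent.
    return np.mean([q_values(w, phi) for w in W], axis=0)

def td_update(k, phi, a, r, phi_next, done, gamma=0.99, lr=0.1):
    # Semi-gradient TD step on ensemble member k; the ensemble mean
    # supplies the bootstrap value in place of a target network.
    target = r + (0.0 if done else gamma * np.max(mean_q(phi_next)))
    td_error = target - q_values(W[k], phi)[a]
    W[k][:, a] += lr * td_error * phi

Because every member is trained against the same low-variance mean target, the stabilizing role usually played by a target network can be obviated, which is the effect the lecture's chapters on variance reduction and "Obviating the target network" discuss.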

Population-Based Methods for Single- and Multi-Agent Reinforcement Learning - Lecture

USC Information Sciences Institute