Explore population-based methods for single- and multi-agent reinforcement learning in this 51-minute lecture by Roy Fox (UC Irvine), presented at the USC Information Sciences Institute. Delve into ensemble methods for reinforcement learning, focusing on the MeanQ algorithm and its ability to reduce estimation variance without a stabilizing target network. Examine MeanQ's curious theoretical equivalence to a non-ensemble method despite its superior empirical performance. Investigate double-oracle methods for adversarial environments, including the XDO algorithm, which exploits sequential game structure to reduce the worst-case population size. Learn about the speaker's research interests in reinforcement learning, algorithmic game theory, information theory, and robotics. Topics covered include RL basics, deep Q-learning, ensemble RL methods, variance-reduction techniques, and extensive-form double-oracle algorithms. Gain insight into recent advances in single- and multi-agent reinforcement learning through this comprehensive seminar.
Population-Based Methods for Single- and Multi-Agent Reinforcement Learning - Lecture