Outline:
Learning (Data-driven decision-making) is a promising …
Control of Networked Markov Decision Processes
Examples of Systems with Local Interactions
Scalable RL for Network Systems
Review: Policy Gradient in the Full Information Case
RL in the Network Setting
The Exponential Decay Property
Truncation of Q-function
Numerical results: Multi-Access Wireless Communication
Other (Multiagent) Learning Settings: Decentralized Control
Optimality Guarantee
Optimization Landscape
Gradient play for identical interest case
General Stochastic Games
Convergence of gradient play?
Summary
Description:
Explore the intricacies of decentralized policies in multiagent systems through this comprehensive lecture by Na Li from Harvard University. Delve into the Scalable Actor Critic (SAC) framework, which leverages network structures to find local, decentralized policies approximating global objectives. Examine the performance of stationary points in scenarios where states are shared among agents but actions follow decentralized policies. Investigate the use of stochastic game frameworks to characterize policy gradient performance in multiagent Markov Decision Process systems.

Learn about opportunities and challenges in decision-making, data-driven approaches, and control of networked Markov Decision Processes. Discover numerical results in multi-access wireless communication and explore various multiagent learning settings, including decentralized control, optimality guarantees, and optimization landscapes. Gain insights into gradient play for identical interest cases and general stochastic games, concluding with a discussion on convergence and a comprehensive summary of the presented concepts.
Learning Decentralized Policies in Multiagent Systems - How to Learn Efficiently