Intro

Today's Outline

System and architecture are the foundation

Properties of Distributed Systems

MIT EECS 6.824 Distributed Systems

Updating Model Parameters

Synchronous Update versus Asynchronous Update

Decentralized Asynchronous Stochastic Gradient Descent

Parallelism in Distributed ML Systems

Hogwild: Lock-free asynchronous SGD

Implementation of Hogwild (async SGD) in PyTorch

Case Study: MapReduce

Case Study: DistBelief

Fun facts about Jeff Dean

Case Study: AlexNet

Diagram of Reinforcement Learning

Development of Distributed RL Systems

2013: Deep Q Network

2015: General Reinforcement Learning Architecture (GORILA)

Review on Actor-Critic Methods

A3C: Asynchronous Advantage Actor-Critic

Comparison to Variants of DQN and GORILA

Sample code for A3C

Why Does Asynchrony Work in A3C?

Comparison of A3C and A2C

Sample code for A2C

2018: Ape-X (Distributed Prioritized Experience Replay)

2018: IMPALA (Importance Weighted Actor-Learner Architecture)

2018: RLlib (Abstraction for Distributed RL)

Some Other Parallelizable Algorithms: (Revisited) Evolution Strategies

Case Study: AI for Modern Games

System Design for AlphaGo Zero

System Design for AlphaStar

Conclusion

Description:

Explore the ninth lecture in a course on reinforcement learning, focusing on distributed systems in RL. Delve into the foundations of system architecture, the properties of distributed systems, and synchronous versus asynchronous approaches to updating model parameters. Examine case studies including MapReduce, DistBelief, and AlexNet, and learn about the development of distributed RL systems from Deep Q Network and Gorila through A3C, Ape-X, IMPALA, and RLlib. Investigate other parallelizable algorithms such as evolution strategies, and study the system designs behind game-playing AI such as AlphaGo Zero and AlphaStar.

Introduction to Reinforcement Learning - Distributed RL Systems - Lecture 9

Bolei Zhou

#Computer Science #Machine Learning #Reinforcement Learning #Distributed Systems #Distributed Computing #MapReduce #Software Engineering #System Architecture #Deep Q Networks
