1. Intro
2. Today's Outline
3. System and architecture are the foundation
4. Properties of Distributed Systems
5. MIT EECS 6.824 Distributed Systems
6. Updating Model Parameters
7. Synchronous Update versus Asynchronous Update
8. Decentralized Asynchronous Stochastic Gradient Descent
9. Parallelism in Distributed ML Systems
10. Hogwild: Lock-free asynchronous SGD
11. Implementation of Hogwild (async SGD) in PyTorch
12. Case Study: MapReduce
13. Case Study: DistBelief
14. Fun facts about Jeff Dean
15. Case Study: AlexNet
16. Diagram of Reinforcement Learning
17. Development of Distributed RL Systems
18. 2013: Deep Q Network
19. 2015: General Reinforcement Learning Architecture (GORILA)
20. Review on Actor-Critic Methods
21. A3C: Asynchronous Advantage Actor Critic (A3C)
22. Comparison to Variants of DQN and GORILA
23. Sample code for A3C
24. Why Does Asynchronism Work in A3C?
25. Comparison of A3C and A2C
26. Sample code for A2C
27. 2018: Ape-X (Distributed Prioritized Experience Replay)
28. 2018: IMPALA (Importance Weighted Actor-Learner Architecture)
29. 2018: RLlib (abstraction for distributed RL)
30. Some Other Parallelizable Algorithms: (Revisited) Evolution Strategies
31. Case Study: AI for Modern Games
32. System Design for AlphaGo Zero
33. System Design for AlphaStar
34. Conclusion
Description:
Explore the ninth lecture in a course on reinforcement learning, focusing on distributed systems in RL. Delve into the foundations of system architecture, properties of distributed systems, and various approaches to updating model parameters. Examine case studies including MapReduce, DistBelief, and AlexNet, and learn about the development of distributed RL systems from Deep Q Network to later architectures such as A3C and IMPALA. Investigate parallelizable algorithms, system designs for AI in modern games like AlphaGo Zero and AlphaStar, and gain insights into the evolution of distributed reinforcement learning architectures.

Introduction to Reinforcement Learning - Distributed RL Systems - Lecture 9

Bolei Zhou