1. Intro
2. Problems of Distributed Training
3. Reduce Transfer Data Size: Recall the Workflow of Parameter-Server-Based Distributed Training
4. Limitations of Sparse Communication
5. Optimizers with Momentum: Repeat, Update Weights
6. Deep Gradient Compression
7. Comparison of Gradient Pruning Methods
8. Latency Bottleneck
9. High Network Latency Slows Federated Learning
10. Conventional Algorithms Suffer from High Latency: Vanilla Distributed Synchronous SGD
11. Delayed Gradient Averaging
12. DGA Accuracy Evaluation
13. Real-world Benchmark
14. Summary of Today's Lecture
15. References
Description:
Explore distributed training challenges and solutions in this lecture from MIT's 6.S965 course. Dive into communication bottlenecks like bandwidth and latency, and learn about gradient compression techniques including gradient pruning and quantization. Discover how delayed gradient averaging can mitigate latency issues in distributed training. Gain insights into efficient machine learning methods for deploying neural networks on resource-constrained devices. Examine topics such as model compression, neural architecture search, and on-device transfer learning. Apply these concepts to optimize deep learning applications for videos, point clouds, and NLP tasks. Access the accompanying slides and resources to enhance your understanding of efficient deep learning computing and TinyML.
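
For a concrete sense of the gradient pruning idea covered in the Deep Gradient Compression chapter, the sketch below shows top-k gradient sparsification in PyTorch. This is a minimal illustration, not the lecture's reference implementation: the function name sparsify_gradient and the 99% sparsity level are assumptions for the example, and full DGC additionally uses local gradient accumulation, momentum correction, and warm-up training, which are omitted here.

```python
import torch

def sparsify_gradient(grad: torch.Tensor, sparsity: float = 0.99):
    """Keep only the largest-magnitude entries of a gradient tensor.

    Returns the kept values and their flat indices; in DGC the pruned
    entries are accumulated locally and transmitted in later steps
    (not shown in this sketch).
    """
    flat = grad.flatten()
    k = max(1, int(flat.numel() * (1.0 - sparsity)))
    _, idx = torch.topk(flat.abs(), k)  # indices of the top-k magnitudes
    return flat[idx], idx

# Toy usage: prune a random "gradient" to ~1% density before communication.
g = torch.randn(1_000)
values, idx = sparsify_gradient(g, sparsity=0.99)
sparse_g = torch.zeros_like(g).scatter_(0, idx, values)  # what a worker would send
```

Sending only the (value, index) pairs instead of the dense tensor is what reduces the transfer data size discussed in chapters 3 through 7.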

Distributed Training and Gradient Compression - Lecture 14

MIT HAN Lab