1. Intro
2. Problems of Distributed Training
3. Reduce Transfer Data Size: Recall the Workflow of Parameter-Server-Based Distributed Training
4. Limitations of Sparse Communication
5. Optimizers with Momentum: Repeat, Update Weights
6. Deep Gradient Compression
7. Comparison of Gradient Pruning Methods
8. Latency Bottleneck
9. High Network Latency Slows Federated Learning
10. Conventional Algorithms Suffer from High Latency: Vanilla Distributed Synchronous SGD
11. Delayed Gradient Averaging
12. DGA Accuracy Evaluation
13. Real-world Benchmark
14. Summary of Today's Lecture
15. References
Description:
Explore distributed training challenges and solutions in this lecture from MIT's 6.S965 course. Dive into communication bottlenecks like bandwidth and latency, and learn about gradient compression techniques including gradient pruning and quantization. Discover how delayed gradient averaging can mitigate latency issues in distributed training. Gain insights into efficient machine learning methods for deploying neural networks on resource-constrained devices. Examine topics such as model compression, neural architecture search, and on-device transfer learning. Apply these concepts to optimize deep learning applications for videos, point clouds, and NLP tasks. Access the accompanying slides and resources to enhance your understanding of efficient deep learning computing and TinyML.
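
For a concrete sense of the gradient pruning idea covered in the Deep Gradient Compression chapter, the sketch below shows top-k gradient sparsification in PyTorch. This is a minimal illustration, not the lecture's reference implementation: the function name sparsify_gradient and the 99% sparsity level are assumptions for the example, and full DGC additionally uses local gradient accumulation, momentum correction, and warm-up training, which are omitted here.

```python
import torch

def sparsify_gradient(grad: torch.Tensor, sparsity: float = 0.99):
    """Keep only the largest-magnitude entries of a gradient tensor.

    Returns the kept values and their flat indices; in DGC the pruned
    entries are accumulated locally and transmitted in later steps
    (not shown in this sketch).
    """
    flat = grad.flatten()
    k = max(1, int(flat.numel() * (1.0 - sparsity)))
    _, idx = torch.topk(flat.abs(), k)  # indices of the top-k magnitudes
    return flat[idx], idx

# Toy usage: prune a random "gradient" to ~1% density before communication.
g = torch.randn(1_000)
values, idx = sparsify_gradient(g, sparsity=0.99)
sparse_g = torch.zeros_like(g).scatter_(0, idx, values)  # what a worker would send
```

Sending only the (value, index) pairs instead of the dense tensor is what reduces the transfer data size discussed in chapters 3 through 7.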

Distributed Training and Gradient Compression - Lecture 14

MIT HAN Lab