Outline:
1. Intro
2. DNN Training Pipeline
3. Overhead of Data Augmentation
4. Existing Approach: Data Echoing
5. Our Approach: Data Refurbishing
6. Analysis on Sample Diversity
7. Standard Training
8. Challenge: Inconsistent Batch Time
9. PyTorch Dataloader
10. Revamper
11. Balanced Eviction
12. Cache-Aware Shuffle
13. Implementation
14. Evaluation: Environments
15. Evaluation: Baselines
16. Evaluation: Accuracy & Throughput
17. Conclusion
Description:
Explore a 15-minute conference talk from USENIX ATC '21 that introduces data refurbishing, a novel sample reuse mechanism to accelerate deep neural network training while preserving model generalization. Learn how this technique splits data augmentation into partial and final stages, reusing partially augmented samples to reduce CPU computation while maintaining sample diversity. Discover the design and implementation of Revamper, a new data loading system that maximizes overlap between CPU and deep learning accelerators. Examine the evaluation results showing how Revamper can accelerate training of computer vision models by 1.03×–2.04× while maintaining comparable accuracy. Gain insights into the DNN training pipeline, challenges of data augmentation, and innovative solutions for improving training efficiency.
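To make the partial/final split concrete, here is a minimal sketch of the refurbishing idea written as a PyTorch Dataset wrapper. This is not Revamper's actual implementation: the class name RefurbishDataset, the reuse_factor parameter, and the two transform arguments are illustrative assumptions, and the real system layers balanced eviction and cache-aware shuffling on top of a caching scheme like this one.

# Illustrative sketch only, not Revamper's API: cache the output of the
# expensive partial augmentation stage and reuse it a fixed number of
# times, while the cheap final stage still runs on every access.
from torch.utils.data import Dataset

class RefurbishDataset(Dataset):
    def __init__(self, base, partial_transform, final_transform, reuse_factor=3):
        self.base = base                      # underlying (sample, label) dataset
        self.partial = partial_transform      # expensive stage; result is cached
        self.final = final_transform          # cheap stage; applied every epoch
        self.reuse_factor = reuse_factor      # uses per cached partial sample
        self._cache = {}                      # idx -> (partial_sample, uses_left)

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        # For simplicity this sketch re-reads the raw sample on every access;
        # a real system would cache the label alongside the partial result.
        sample, label = self.base[idx]
        partial, uses_left = self._cache.get(idx, (None, 0))
        if uses_left <= 0:
            partial = self.partial(sample)    # cache miss: pay the full CPU cost
            uses_left = self.reuse_factor
        self._cache[idx] = (partial, uses_left - 1)
        return self.final(partial), label     # fresh final augmentation each time

Under this scheme, if the partial stage costs p per sample and the final stage f, the average CPU cost per sample drops from p + f to roughly p / reuse_factor + f, which is the source of the CPU savings the talk describes; keeping the final stage random on every access is what preserves sample diversity.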

Refurbish Your Training Data: Reusing Partially Augmented Samples for Faster Deep Neural Network Training

USENIX