SparTA - Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute
USENIX, OSDI '22
Outline:
1. Intro
2. Computation Capacity vs. DNN Model Size
3. Sparsity Commonly Exists
4. Evolution of Sparsity Patterns
5. Obstacles to Sparsity Optimization
6. The Myth of Proxy Metrics
7. Across-Stack Innovations in Silos
8. SparTA: An End-to-End Approach to Model Sparsity
9. Core Abstraction: TeSA
10. System Architecture
11. Execution Transformation
12. Code Specialization
13. What SparTA Achieves
14. Evaluation on Various Patterns & Models
15. End-to-End Opportunity
16. Mixed Sparsity Evaluation
17. Real Latency for Algorithms
18. Conclusion
Description:
Explore an innovative approach to deep-learning model sparsity in this 16-minute conference talk from OSDI '22. Learn about Tensor-with-Sparsity-Attribute (TeSA), a new abstraction that augments the default Tensor abstraction used by dense models. Discover how TeSA enables sparsity attributes and patterns to be specified, propagated, and exploited across an entire deep learning model, resulting in highly efficient, specialized operators. Understand how the SparTA framework accommodates various sparsity patterns and optimization techniques, delivering significant inference speedups over state-of-the-art solutions. Gain insights into the evolution of sparsity patterns, the obstacles to sparsity optimization, and the importance of end-to-end approaches to model sparsity. Examine the framework's architecture, execution transformation, and code specialization techniques, as well as its performance across various patterns and models.
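
The description highlights TeSA's central mechanism: sparsity attributes attached to tensors are propagated through the model so that each operator can be specialized to the sparsity it actually faces. Below is a minimal Python sketch of that idea under stated assumptions; the names TeSA, propagate_matmul, and specialized_matmul are hypothetical illustrations chosen for this sketch, not SparTA's actual API.

    import numpy as np

    class TeSA:
        """A tensor paired with a per-element sparsity attribute.

        Hypothetical illustration of the Tensor-with-Sparsity-Attribute
        idea; attr[i, j] == 0 marks an element known to be pruned
        (provably always zero).
        """
        def __init__(self, values: np.ndarray, attr: np.ndarray):
            assert values.shape == attr.shape
            self.values = values
            self.attr = attr.astype(np.int8)

    def propagate_matmul(a: TeSA, b: TeSA) -> np.ndarray:
        """Forward-propagate attributes through C = A @ B.

        C[i, j] can be non-zero only if some k has both A[i, k] and
        B[k, j] kept, so the output attribute is the boolean matrix
        product of the input attributes.
        """
        counts = a.attr.astype(np.int64) @ b.attr.astype(np.int64)
        return (counts > 0).astype(np.int8)

    def specialized_matmul(a: TeSA, b: TeSA) -> np.ndarray:
        """Toy 'code specialization': skip rows of A that the attribute
        proves are entirely pruned, instead of multiplying by zeros.
        """
        c = np.zeros((a.values.shape[0], b.values.shape[1]))
        live_rows = np.flatnonzero(a.attr.any(axis=1))
        c[live_rows] = (a.values * a.attr)[live_rows] @ (b.values * b.attr)
        return c

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        # A with its first two rows fully pruned; B dense.
        a_attr = np.ones((4, 4), dtype=np.int8)
        a_attr[:2] = 0
        a = TeSA(rng.standard_normal((4, 4)), a_attr)
        b = TeSA(rng.standard_normal((4, 4)), np.ones((4, 4), dtype=np.int8))
        print(propagate_matmul(a, b))    # rows 0-1 are provably all-zero
        print(specialized_matmul(a, b))  # skips the pruned rows entirely

Propagation of this kind is what turns sparsity declared on one tensor (say, a pruned weight matrix) into provable sparsity on downstream tensors, which a framework can then exploit with specialized kernels, as specialized_matmul crudely does by skipping rows known to be all zero.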
