1. Introduction
2. Agenda
3. What is Pruning
4. Classification
5. Observations
6. Timeline
7. Optimal Brain Damage Framework
8. OBS Framework
9. Efficient Second-Order Approximation
10. Results
11. Natural Language Processing
12. Deep Space Deployment
13. Next Step Up Results
14. Intuition
15. Weight Update
16. Open Source
17. Q&A Session
Description:
Explore the cutting-edge world of second-order pruning algorithms for state-of-the-art model compression in this 42-minute video presentation by Eldar Kurtić, Research Consultant at Neural Magic. Dive into the research, production results, and intuition behind these powerful techniques that enable higher sparsity while maintaining accuracy. Learn how to achieve significant model size reduction, lower latency, and higher throughput by removing the weights that least affect the loss function.

Discover real-world examples, such as pruning a ResNet-50 image classification model by 95% while retaining 99% of its baseline accuracy, resulting in a dramatic file size reduction from 90.8MB to 9.3MB.

Follow along as the speaker guides you through the Optimal Brain Damage framework, the OBS framework, and an efficient second-order approximation. Gain insights into applications in classification, natural language processing, and deep space deployment. Understand the intuition behind weight updates and explore open-source implementations. Conclude with a Q&A session to address any questions about applying these advanced pruning algorithms to your machine learning projects.
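The core idea the talk builds on can be illustrated with a minimal sketch of the classic Optimal Brain Surgeon (OBS) rule: each weight's saliency is w_q² / (2·[H⁻¹]_qq), the lowest-saliency weight is removed, and the remaining weights receive a compensating update −(w_q / [H⁻¹]_qq)·H⁻¹e_q. This toy version forms and inverts a small dense Hessian explicitly; the efficient second-order approximations discussed in the talk exist precisely to avoid that cost at neural-network scale, so treat this as an illustration of the math, not the production algorithm.

```python
import numpy as np

def obs_saliencies(weights, hessian):
    """OBS saliency per weight: rho_q = w_q^2 / (2 * [H^-1]_qq).
    A small saliency means removing that weight barely changes the loss."""
    h_inv = np.linalg.inv(hessian)
    return weights ** 2 / (2.0 * np.diag(h_inv))

def obs_prune_one(weights, hessian):
    """Remove the lowest-saliency weight q and apply the OBS update
    delta_w = -(w_q / [H^-1]_qq) * H^-1 e_q to compensate the remaining
    weights. Returns the updated weight vector and the pruned index."""
    h_inv = np.linalg.inv(hessian)
    q = int(np.argmin(weights ** 2 / (2.0 * np.diag(h_inv))))
    delta = -(weights[q] / h_inv[q, q]) * h_inv[:, q]
    pruned = weights + delta
    pruned[q] = 0.0  # the update drives w_q to zero exactly, up to rounding
    return pruned, q
```

With an identity Hessian this reduces to magnitude pruning (the smallest-magnitude weight is removed and nothing else moves); with a non-diagonal Hessian the update redistributes the removed weight's contribution across correlated weights, which is what lets second-order methods reach much higher sparsity at the same accuracy.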

Applying Second-Order Pruning Algorithms for SOTA Model Compression

Neural Magic