Chapters:
1. Intro
2. "Bigger is Better"
3. The Problem
4. Model Compression
5. Technique 1: Quantization
6. Technique 2: Pruning
7. Technique 3: Knowledge Distillation
8. Example: Compressing a model with KD + Quantization
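
The distillation chapter above trains a small "student" model to match a large "teacher" model's output distribution. As a minimal, self-contained sketch of the core idea (the soft-target loss in the style of Hinton et al., not the exact code used in the video; all names here are illustrative):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; T > 1 softens the distribution."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy between the teacher's softened distribution and the
    student's. The T**2 factor keeps gradient magnitudes comparable to a
    hard-label loss when the two are combined."""
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T) + 1e-12)
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean() * T**2)

# The loss is minimized when the student reproduces the teacher's logits.
teacher = np.array([[2.0, 0.5, -1.0]])
matched = distillation_loss(teacher, teacher)
mismatched = distillation_loss(np.array([[0.0, 2.0, -1.0]]), teacher)
```

In practice this term is usually mixed with the ordinary cross-entropy on the true labels, and the student is much smaller than the teacher (e.g. DistilBERT vs. BERT).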
Description:
Explore three methods for compressing Large Language Models (LLMs): Quantization, Pruning, and Knowledge Distillation (also called Model Distillation), with accompanying Python code examples. Learn about the challenges posed by growing model sizes and the benefits of compression techniques. Follow along with a practical demonstration that combines Knowledge Distillation and Quantization to compress a BERT-based phishing classifier. Additional resources include a blog post, a GitHub repository, pre-trained models, and a dataset for further exploration of LLM compression.
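
To make the quantization step concrete, here is a minimal sketch of affine (asymmetric) int8 quantization, the idea behind storing 32-bit float weights as 8-bit integers for a roughly 4x size reduction. This is an illustration of the technique, not the code from the video; function names and parameters are my own:

```python
import numpy as np

def quantize_int8(w):
    """Map float weights onto int8 values plus a (scale, zero_point)
    pair that lets us approximately recover the originals."""
    qmin, qmax = -128, 127
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / (qmax - qmin)
    zero_point = int(round(qmin - w_min / scale))
    q = np.clip(np.round(w / scale + zero_point), qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float weights from the int8 representation."""
    return scale * (q.astype(np.float32) - zero_point)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)   # toy "weight matrix"
q, scale, zero_point = quantize_int8(w)
w_hat = dequantize(q, scale, zero_point)
# Reconstruction error is bounded by roughly one quantization step (scale).
```

Libraries such as PyTorch wrap this idea in ready-made utilities (e.g. dynamic post-training quantization of `nn.Linear` layers), which is the practical route for a BERT-sized model.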

Compressing Large Language Models (LLMs) with Python Code - 3 Techniques

Shaw Talebi