Intro

"Bigger is Better"

The Problem

Model Compression

1 Quantization

2 Pruning

3 Knowledge Distillation

Example: Compressing a model with KD + Quantization

Description:

Explore three methods for compressing Large Language Models (LLMs) - Quantization, Pruning, and Knowledge Distillation/Model Distillation - with accompanying Python code examples. Learn about the challenges of model size and the benefits of compression techniques. Follow along with a practical demonstration of combining Knowledge Distillation and Quantization to compress a BERT-based phishing classifier model. Access additional resources including a blog post, GitHub repository, pre-trained models, and dataset for further exploration of LLM compression techniques.
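
As a rough illustration of the three techniques named above, here is a minimal pure-Python sketch. It is not the code from the video or its GitHub repository: the function names (`quantize_int8`, `magnitude_prune`, `distillation_loss`) are hypothetical, the weights are plain lists rather than model tensors, and the choices of symmetric int8 quantization, magnitude-based pruning, and a temperature-softened KL loss are assumptions made for the example.

```python
import math


def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


def magnitude_prune(weights, sparsity=0.5):
    """Unstructured pruning: zero out the `sparsity` fraction of smallest-magnitude weights."""
    k = int(len(weights) * sparsity)
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k]
    pruned = list(weights)
    for i in smallest:
        pruned[i] = 0.0
    return pruned


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened output distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    w = [0.5, -1.0, 0.25, 0.03]
    q, scale = quantize_int8(w)
    print("quantized:", q, "scale:", scale)
    print("dequantized:", dequantize(q, scale))
    print("pruned:", magnitude_prune(w, sparsity=0.5))
    print("KD loss:", distillation_loss([1.0, 0.2], [2.0, -1.0]))
```

In a knowledge-distillation setup, a loss like this is typically combined with the ordinary classification loss while training the smaller student model, and quantization is applied to the trained student afterwards. Real workflows would use a deep learning framework rather than Python lists.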

Compressing Large Language Models (LLMs) with Python Code - 3 Techniques

Shaw Talebi

#Computer Science #Machine Learning #Model Compression #Programming #Programming Languages #Python #Artificial Intelligence #Natural Language Processing (NLP) #LLM (Large Language Model) #BERT #Quantization #Hugging Face
