How to quantize a large language model

Why quantize a language model

What is quantization

Which quantization to use?

GGUF vs BNB vs AWQ vs GPTQ

How to quantize with AWQ

How to quantize with GGUF (GGML)

Recap

Description:

Learn how to quantize large language models using GGUF or AWQ in this 26-minute video tutorial. Explore the reasons for quantization, understand different quantization methods, and compare GGUF, BNB, AWQ, and GPTQ techniques. Follow step-by-step instructions for quantizing models with AWQ and GGUF (GGML), and gain access to advanced fine-tuning resources, including scripts for unsupervised and supervised fine-tuning, dataset preparation, and embedding creation. Discover valuable resources such as presentation slides, GitHub repositories, and related research papers to enhance your understanding of LLM quantization techniques.

How to Quantize a Large Language Model with GGUF or AWQ

Trelis Research

#Computer Science #Machine Learning #Quantization #Model Compression #Hugging Face #llama.cpp #GGUF
