Explore the benefits of quantization in PyTorch with Suraj Subramanian, an ML engineer and developer advocate at Meta AI, in this 28-minute video. Learn how rounding a model's FP32 parameters to integers can make it lighter, faster, and more power-efficient while largely preserving accuracy. The video covers the need for efficient AI, quantization basics, the quantization techniques available in PyTorch and the workflow for applying them, and future developments in the field. Subramanian draws on deep learning experience across personal finance, healthcare research, and behavioral finance.
Leaner and Greener AI with Quantization in PyTorch - Suraj Subramanian
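
To make the idea of rounding FP32 parameters to integers concrete, here is a minimal sketch of affine (scale plus zero-point) quantization to 8 bits in plain Python. It is an illustration only, not code from the video and not PyTorch's API: the function names `quantize` and `dequantize` and the sample weights are hypothetical.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned integers using a scale and a zero point.

    Hypothetical helper for illustration; not a PyTorch function.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    # Widen the range to include 0.0 so that zero maps to an exact integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to 1.0 if every value is zero, to avoid dividing by zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    # Round each value to the nearest step, shift, and clamp to the integer range.
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [(x - zero_point) * scale for x in q]


# Made-up example weights, not taken from any real model.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(q)
print(restored)
print(max_error)
```

Each quantized value fits in 8 bits instead of 32, which is where the size reduction comes from. The price is a reconstruction error of at most about one quantization step (`scale`) per value.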