This 20-minute conference talk from OSDI '20 presents Ansor, a framework for generating high-performance tensor programs for deep learning across multiple hardware platforms. Ansor combines a hierarchical search space representation, evolutionary search, and a learned cost model, and it outperforms existing search strategies, improving the performance of deep neural networks on Intel CPUs, ARM CPUs, and NVIDIA GPUs. Its task scheduler optimizes multiple subgraphs of a network simultaneously. The presentation covers the challenges in current deep learning systems, the deep learning system stack, existing compiler approaches, program sampling techniques, and ablation studies.
Ansor - Generating High-Performance Tensor Programs for Deep Learning