Explore a conference talk from USENIX ATC '20 that introduces offload annotations (OAs), an approach that brings heterogeneous GPU computing to existing workloads with minimal code changes. Learn how OAs let developers reach specialized hardware accelerators such as GPUs through high-level APIs, lowering the barriers to developing and adopting kernel libraries. Discover how an annotator marks CPU library types and functions with equivalent kernel library functions, exposing an offloading API for input/output partitioning and device memory management. Understand the runtime's role in mapping annotated CPU functions to GPU kernels and in managing execution, data transfers, and paging. Examine the performance gains in data science workloads built on CPU libraries such as NumPy and Pandas, where transparent GPU offloading delivers speedups of up to 1200x with a median of 6.3x. Gain insights into OAs' ability to match handwritten heterogeneous implementations and to automatically page data for datasets larger than GPU memory.
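The core idea — annotating a CPU function with an equivalent accelerator function plus data-transfer hooks, and letting a small runtime decide where to run — can be sketched as below. This is a minimal illustrative mock, not the paper's actual annotation API or runtime: the names `offload_annotation`, `to_device`, and `to_host` are hypothetical, and a tagged tuple stands in for device-resident memory.

```python
# Hypothetical sketch of the offload-annotation idea. An annotator registers,
# for each CPU library function, an equivalent "kernel" function and transfer
# hooks; a tiny runtime then dispatches calls and moves data across the
# (simulated) CPU/GPU boundary. All names here are illustrative assumptions.

REGISTRY = {}

def offload_annotation(kernel_func, to_device, to_host):
    """Mark a CPU function with its accelerator equivalent and transfer hooks."""
    def wrap(cpu_func):
        REGISTRY[cpu_func] = (kernel_func, to_device, to_host)

        def dispatch(*args, use_gpu=False):
            # The real runtime would pick the device automatically based on
            # data location and size; here a flag stands in for that policy.
            if use_gpu:
                kernel, up, down = REGISTRY[cpu_func]
                device_args = [up(a) for a in args]   # host -> device transfer
                return down(kernel(*device_args))      # run kernel, copy back
            return cpu_func(*args)                     # plain CPU path
        return dispatch
    return wrap

# Stand-in "device memory": values tagged as device-resident.
def to_device(x):
    return ("on_device", x)

def to_host(x):
    return x[1]

def kernel_add(a, b):
    # Pretend GPU kernel: operates only on device-resident values.
    return ("on_device", [x + y for x, y in zip(a[1], b[1])])

@offload_annotation(kernel_add, to_device, to_host)
def add(a, b):
    # Original CPU library function.
    return [x + y for x, y in zip(a, b)]

print(add([1, 2], [3, 4]))                # CPU path -> [4, 6]
print(add([1, 2], [3, 4], use_gpu=True))  # "offloaded" path -> [4, 6]
```

In the real system the kernel equivalents come from GPU libraries (e.g. CuPy for NumPy), and the runtime also partitions inputs into pages so datasets larger than GPU memory can be processed chunk by chunk.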
Offload Annotations: Bringing Heterogeneous Computing to Existing Libraries and Workloads