Name:
Malith Jayaweera
Title:
Energy-Aware Transformations for Affine Programs on GPUs
Date:
8/6/2024
Time:
10:00:00 AM
Committee Members:
Prof. David Kaeli (Co-advisor)
Prof. Yanzhi Wang (Co-advisor)
Dr. Norman Rubin
Prof. Martin Kong (Ohio State University)
Abstract:
Graphics Processing Units (GPUs) are increasingly used to accelerate workloads ranging from high-performance computing to machine learning. The development of high-level programming languages, improved compilers, and runtime drivers has helped accelerate the widespread adoption of GPUs. Given this wider adoption and ever-increasing computing capabilities, the power consumption of GPUs is quickly becoming a critical factor. Furthermore, GPU micro-architecture differs from vendor to vendor, and even between hardware generations from the same vendor. In addition, program variants with similar performance can differ in energy consumption because they utilize GPU resources such as Streaming Multiprocessors (SMs) or memory differently. Despite performance improvements in compilation techniques, energy-aware code generation for heterogeneous GPUs has not been aggressively explored.
In this dissertation, we first identify the potential for energy-aware compilation techniques for GPUs. Next, we use these insights to study loop tiling, a popular loop transformation that has been successfully applied to computational domains such as linear algebra, deep neural networks, and iterative stencils. We then propose an energy-aware tile size selection approach for affine programs to generate energy-efficient code targeting GPUs.
We also investigate the challenging problem of optimizing the scheduling of complex sparse tensor algebra expressions on GPUs, with a focus on maximizing the use of available parallelism to unlock optimal performance. We comprehensively examine the search space for sparse tensor expression scheduling, seeking to characterize the intricate inter-relationships between kernel characteristics, GPU architecture, and hardware constraints such as memory bandwidth limitations, to inform optimal scheduling decisions.