Phase II Amount
$1,098,712
As HPC systems continue to increase in heterogeneity, performance portability becomes a universal challenge for application developers who wish to extract the performance that can come from machine-specific features. Across academia, government, and industry, multiple template metaprogramming techniques have been widely deployed to assist with the abstraction of the varying hardware elements found on today's systems. These metaprogramming techniques have proven effective at building high-performance applications and libraries across a wide range of scientific and engineering fields. However, the inherent abstraction layers in these programming techniques present a challenge for users attempting to efficiently analyze application performance. The goal of the META project is to construct a malleable performance analysis infrastructure that lets users analyze the parallel execution and memory performance of a variety of template metaprogramming language constructs, and that attaches directly to the existing LLVM compiler tool chains widely used in high performance computing. The analysis infrastructure provided by META derives and depicts performance data from parallel applications at compile time across a variety of template metaprogramming constructs via command-line and graphical interfaces. Through the Phase I effort, we successfully developed and demonstrated the underlying META infrastructure, which provides performance and memory analysis methods that are agnostic of the overall template metaprogramming technique. This infrastructure uses the compiler's existing syntax tree interfaces to parse the target parallel program and construct a directed acyclic graph representation of its execution and memory patterns. From this graph, we can execute a series of passes that derive and present performance-related data at compile time.
We demonstrated META by analyzing the performance of a variety of high performance computing applications written using the Kokkos metaprogramming approach. Phase II of the META project will build and expand upon the infrastructure developed in Phase I. First, we will expand the initial set of analysis passes to include more complex transforms and investigative procedures for parallel applications and memory layouts. We will also productize the initial support for Kokkos and add support for the RAJA metaprogramming construct. Finally, we will investigate automatically transforming parallel applications using the data provided by the analysis passes in order to automate the construction of high-performance parallel algorithms. Performance portability is a cross-cutting problem because developers need to port code across multiple platforms, be they different HPC centers, heterogeneous embedded devices, or cloud-computing providers. A vendor-agnostic performance analysis infrastructure tied to an existing open source compiler tool chain can dramatically improve users' ability to create high-performance parallel applications and decrease the time to efficient application solutions.