Aurora software development: Compiler readiness

In this series, we examine the range of activities and collaborations that ALCF staff undertake to guide the facility and its users into the next era of scientific computing.

Preparing for Aurora

Thomas Applencourt and Colleen Bertoni are computational scientists at the U.S. Department of Energy's (DOE) Office of Science's Argonne Leadership Computing Facility (ALCF). Their work, bolstered by close collaborations with colleagues on Intel's compiler team, helps ensure that the exascale Intel and HPE Aurora system is ready for Day One production science.

ALCF researchers Thomas Applencourt (left) and Colleen Bertoni are collaborating with Intel to ensure the Aurora compiler software is performant and ready for science in advance of the exascale system's arrival. (Image: Argonne National Laboratory)

The Intel team is developing a compiler to enable applications on Intel’s high-performance Xe processor, the GPU that will provide the bulk of Aurora’s computational power. As part of Argonne’s close collaboration with Intel to evaluate the compiler’s functionality and performance on Exascale Computing Project (ECP) and Early Science Project (ESP) applications, Bertoni and Applencourt work to ensure that the robustness and performance of the software, which is essential for running code on the hardware, meet the ALCF’s expectations.

This means the compiler must be highly performant and capable of delivering production science at Aurora’s launch.

When Bertoni and Applencourt began their work, the compilers were in something of a “pre-alpha” state: very little formal testing had been performed on the software, which was buggy, missing features, and functional only in a rudimentary sense.

This is the first time that Intel has implemented discrete GPU support in any of its C, C++, or Fortran compilers. It is producing the components as open-source code, thereby enabling the development of open-source compiler projects like LLVM.

Methods and tools

Based on the needs of ECP and ESP teams using Argonne computational resources, Bertoni and Applencourt isolate and triage bugs and feature requests and, coordinating with Intel, ensure that the bugs are fixed and the features implemented in a timely fashion. The compiler itself is tested daily against a full array of standard language benchmarks.

Collaborations with Intel are primarily driven by phone and email exchanges in which current issues and upcoming features are discussed, culminating in monthly calls for the escalation of the highest-priority bugs.

Bertoni and Applencourt also collaborate with the ECP and ESP teams to help test applications, find bugs, and identify reproducers for quality assurance testing.

Partnerships with the National Energy Research Scientific Computing Center (NERSC) and the Oak Ridge Leadership Computing Facility (OLCF), meanwhile, help achieve consensus on key OpenMP directives for GPU compilers.

Results so far

Bertoni and Applencourt’s efforts have spurred the development of code for math library function calls common to Argonne work and have led Intel to prioritize implementing GPU versions of those functions.

They are currently tracking twenty applications and mini-apps—including QMCPACK, MILC, PHASTA, BerkeleyGW, WEST, and GAMESS—for testing, and tracking 250 reproducers based on bugs they’ve reported.

Bertoni and Applencourt also devote time to optimizing user codes for the Aurora system, and are always eager to benchmark new applications. Relatedly, Applencourt has developed an extensive new conformance test, OvO, targeting the Aurora-supported OpenMP programming model. A collection of OpenMP offloading test functions for C++ and Fortran, OvO focuses on extensive testing of hierarchical parallelism and mathematical functions. The hierarchical parallelism tests generate three types of kernels, while mathematical-function tests determine if all functions of a specified standard can be offloaded.
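
To make the two test families described above more concrete, here is a minimal sketch of what an OpenMP offloading check could look like in C++. It is not taken from OvO; the function names `offloaded_count` and `offloaded_sqrt_matches_host` are hypothetical, as is the choice of one combined construct and one math function. The first function exercises hierarchical parallelism (teams, then threads within each team) through a reduction, and the second checks that a standard math function can be called inside a target region. If the compiler is invoked without OpenMP offload support, the pragmas are ignored and the same code runs on the host.

```cpp
#include <cmath>

// Hypothetical hierarchical-parallelism check (not OvO's actual code):
// every loop iteration adds 1 to a reduction variable, so a correct
// offload of the combined teams/parallel construct must return n.
long offloaded_count(long n) {
    long counter = 0;
    #pragma omp target teams distribute parallel for \
        reduction(+ : counter) map(tofrom : counter)
    for (long i = 0; i < n; ++i) {
        counter += 1;
    }
    return counter;
}

// Hypothetical math-function check (not OvO's actual code):
// evaluate sqrt inside a target region and compare the value
// against the one computed on the host, within a tolerance.
bool offloaded_sqrt_matches_host(double x, double tol) {
    double device_value = 0.0;
    #pragma omp target map(to : x) map(from : device_value)
    {
        device_value = std::sqrt(x);
    }
    const double host_value = std::sqrt(x);
    return std::fabs(device_value - host_value) <= tol;
}
```

A test driver would call, for example, `offloaded_count(1000)` and expect `1000`, and call `offloaded_sqrt_matches_host(2.0, 1e-12)` and expect `true`. A mismatch in either case would be the kind of result that gets reduced to a reproducer and reported to the compiler team.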