Mira Performance Boot Camp boosts code performance, propels science

Boot Camp collaborations boost code performance and propel research

New and seasoned ALCF users convened at Argonne National Laboratory May 20-23 for the annual Mira Performance Boot Camp. For many, the three-day event was a timely opportunity to tap into the expertise of ALCF staff and invited guests to improve their code’s scalability in preparation for a 2015 INCITE proposal submission.

The INCITE program is the primary mechanism through which the majority of available compute hours at the DOE leadership computing facilities are allocated to projects at work on breakthrough science and engineering. Proposals for INCITE awards are due June 27 for allocations available beginning January 1, 2015. To be eligible for an INCITE award, researchers must pass a rigorous review for scientific merit and also demonstrate that their code can use the massive compute resources of leadership-class systems like those available at the ALCF. The ALCF’s Boot Camp helps researchers demonstrate that computational readiness.

The bulk of the three-day event was devoted to hands-on, one-on-one tuning of applications. In addition, ALCF experts spoke on topics of interest, including Blue Gene/Q architecture, ensemble jobs, parallel I/O, and data analysis. Guest tool and debugger developers provided information and individualized assistance to attendees.

Highlighted Accomplishments:

  • NCAR researchers realized a 30 percent speedup for their Community Earth System Model and shaved weeks off their development time with the installation of a new parallel driver. The team will see further gains with the creation of a new ensemble run script for their production runs.
  • Collaborative efforts at Boot Camp helped researchers on an INCITE project focused on novel materials design develop a script that automates their workflow and removes a bottleneck that was preventing effective use of their allocation.
  • A researcher successfully profiled his finite-difference code, which integrates the Navier-Stokes equations to simulate turbulent flows, on up to 8 racks of Mira. He identified the cause of performance bottlenecks related to file I/O and implemented a more efficient algorithm for reading input, resulting in a 7x speedup.
  • Working with on-hand industry experts, a group using a commercial CFD/combustion code successfully uncovered a major performance bottleneck: a serial part of their code that runs increasingly slowly as the code runs on more nodes for a fixed problem size.
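
The last item above describes a common strong-scaling symptom: for a fixed problem size, the parallel portion of the work shrinks as nodes are added, while a serial portion does not shrink and may even grow. The following Python sketch is only a toy model of that behavior, not the group's code or measurements. The function names and all constants (`parallel_work`, `serial_base`, `serial_per_node`) are hypothetical, chosen solely to show the trend.

```python
def runtime(nodes, parallel_work=1000.0, serial_base=1.0, serial_per_node=0.05):
    """Modeled wall time (arbitrary units) for a fixed problem size.

    All parameters are hypothetical and serve only to illustrate the trend.
    """
    parallel_time = parallel_work / nodes                 # shrinks as nodes are added
    serial_time = serial_base + serial_per_node * nodes   # grows as nodes are added
    return parallel_time + serial_time


def speedup(nodes, **kwargs):
    """Speedup relative to a single node under the same toy model."""
    return runtime(1, **kwargs) / runtime(nodes, **kwargs)


if __name__ == "__main__":
    for n in (1, 16, 64, 256, 1024, 4096):
        print(f"{n:5d} nodes: runtime {runtime(n):8.2f}, speedup {speedup(n):6.2f}")
```

Under these assumed constants, the modeled runtime falls at first and then rises again once the growing serial term dominates, which is the kind of pattern a scaling study or profiler can expose.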

With dedicated Boot Camp reservation queues, attendees had quick, uninterrupted access to ALCF resources, allowing them to run nearly 400 jobs and use over 1.7 million core-hours as they diagnosed code issues and tweaked performance.

Even tried-and-true codes benefited from the intense support at Boot Camp, including NCAR’s Community Earth System Model. Said NCAR researcher Adrianne Middleton, “Our model has already been tuned extensively, so the improvements we made were totally unexpected. The 30 percent speedup means we can get 30 percent more science from our INCITE allocation.”

View the agenda and links to the presentation slides.