The ALCF Getting Started Bootcamp will introduce attendees to the Polaris computing environment. Aimed at participants who have experience using clusters or supercomputers, the bootcamp will cover the PBS job scheduler, preinstalled environments, proper compiler and profiler use, Python environments, and running Jupyter notebooks. The goal is to show attendees where these tools are located and which ones to use.
The second segment of this webinar focuses on NVIDIA Developer Tools, which are available for detailed performance analysis of HPC applications running on the NVIDIA A100 GPUs on Polaris. Nsight Systems gives developers a system-wide visualization of an application's performance, helping them find and address bottlenecks so that applications scale efficiently across any number or size of CPUs and GPUs on Polaris. Nsight Compute is an interactive kernel profiler for CUDA applications. It provides detailed performance metrics and API debugging through a user interface and a command-line tool. In this session, several use cases of Nsight Systems and Nsight Compute will be demonstrated with simple HPC benchmarks on Polaris.
All the tutorials are available on our GitHub repository. The bootcamp will be a live demonstration that attendees can run at the same time, asking questions or getting help when things fail unexpectedly. Please note that to follow along with the speakers, you need an existing account on the ALCF Polaris system.
JaeHyuk Kwack is a member of the performance engineering group at the Argonne Leadership Computing Facility (ALCF). He is a lead for performance tools on ALCF computing resources and is responsible for ensuring the readiness of a number of major scientific applications for performant use on the U.S. Department of Energy’s (DOE) forthcoming Aurora exascale system.
Taylor Childers has a Ph.D. in Physics from the University of Minnesota. He worked at the CERN laboratory in Geneva, Switzerland, for six years as a member of the ATLAS experiment and is a co-author of the July 2012 Higgs boson discovery paper. He has worked in physics analysis, workflows, and simulation, from scaling on DOE supercomputers to fast custom electronics (ASIC/FPGA). He applies deep learning to science domain problems, including using Graph Neural Networks to perform semantic segmentation that associates each of the 100 million pixels of the ATLAS detector with particles originating from the proton collisions. He is currently working with scientists from different domains to apply deep learning to their datasets and take advantage of exascale supercomputers arriving in the next few years.