Best Practices for Queueing and Running Jobs on Theta

Chris Knight, Adrian Pope, and Misha Salim, Argonne National Laboratory
ALCF Developer Sessions: Best Practices for Queueing and Running on Theta

This session focuses on effective scheduling and configuration of jobs on the ALCF's Theta supercomputer to improve user experience. After a brief overview of the Cobalt scheduler, we describe best practices for writing batch scripts and for working interactively (for example, with Jupyter notebooks), and we discuss considerations that affect queue turnaround time. We present example scripts for common simulation, data, and learning workloads, covering tasks such as using SSDs for local storage, running containerized jobs with Singularity, and launching ensemble runs. We also show how high-throughput workloads can use Balsam to launch many applications per Cobalt job.

About the Speakers

Chris Knight is the catalyst team lead. His research interests include the advancement of molecular simulations to model soft condensed matter using both classical and ab initio methods and understanding scientific application performance on future computational resources.

Adrian Pope is an assistant computational scientist. He is working at the intersection of data, computing, and statistical methods for cosmological inference.

Misha Salim is an assistant computational scientist. His research interests include data science and Balsam workflow management.

About the Series

The ALCF Developer Sessions webinar series was created to foster discussion between the software and hardware developers and the early users of that technology.