Best Practices for Queueing and Running Jobs on Theta
This session focuses on scheduling and configuring jobs effectively on the ALCF's Theta supercomputer to improve the user experience. After a brief overview of the Cobalt scheduler, we describe best practices for writing batch scripts and for working interactively, for example with Jupyter notebooks, and we discuss considerations that affect queue turnaround time. We present example scripts for common simulation, data, and learning workloads, covering tasks such as using SSDs for local storage, running containerized jobs with Singularity, and launching ensemble runs. We also show how high-throughput workloads can use Balsam to launch many applications per Cobalt job.
About the Speakers
Chris Knight is the catalyst team lead. His research interests include advancing molecular simulations to model soft condensed matter using both classical and ab initio methods, and understanding scientific application performance on future computational resources.
Adrian Pope is an assistant computational scientist. He works at the intersection of data, computing, and statistical methods for cosmological inference.
Misha Salim is an assistant computational scientist. His research interests include data science and Balsam workflow management.
About the Series
The ALCF Developer Sessions webinar series was created to foster discussion between the software and hardware developers and the early users of that technology.