Using CODES/TraceR for Application Simulations on HPC Networks

Misbah Mubarak, ANL Postdoc
Nikhil Jain, LLNL Postdoc
Seminar

Design space exploration and procurement for next-generation high performance computing (HPC) systems are often guided by the expected performance and cost tradeoffs of the available options. The increasing complexity of today’s HPC architectures reduces the prediction accuracy of simple models and thus necessitates detailed simulations. In this tutorial, we focus on discrete-event simulation of networks, which are a major component of HPC systems, and discuss factors that impact the simulation of realistic scenarios.

We will introduce the CODES/TraceR simulation framework, which has been developed to facilitate studies of application performance on current and future networks. We will present the capabilities of this framework and describe how it can be used to mimic real-world scenarios. In particular, we will discuss how production applications and their multi-job workloads with customized job placement schemes can be simulated with minimal effort. The tutorial will also cover the installation process and usage guidelines, with brief notes on the community software used by the framework. Finally, we will present case studies from recent work that illustrate how the CODES/TraceR framework can be used for design space explorations and procurement studies.