Comparison of Virtualization and Containerization Techniques for High-Performance Computing

Balaji Subramaniam
Seminar

High Performance Computing (HPC) users have traditionally run their scientific applications on dedicated clusters hosted at national laboratories and universities. Recently, cloud computing has become a popular alternative for such applications, as exemplified by Amazon's HPC instances. Nonetheless, HPC users have approached the cloud cautiously for two main reasons. First, cloud instances are virtualized, and virtualization carries a performance overhead. Second, virtual instances are co-located to improve overall cluster utilization, and co-location can lead to prohibitive performance variability. In spite of these concerns, cloud computing is gaining popularity in HPC because of its high availability, lower queue wait times, and flexibility to support different types of applications (including legacy applications).

Recent developments in virtualization and containerization have alleviated some of these performance concerns. New container technologies such as Linux Containers (LXC) and Docker aim to deliver near bare-metal performance while retaining many of the benefits of a virtualized instance. However, the applicability of these technologies to HPC applications has not yet been thoroughly explored. Moreover, the Kernel-based Virtual Machine (KVM) exposes several tunable parameters that can be adjusted to improve its suitability for HPC environments. Furthermore, the scalability of scientific applications in virtualized or containerized environments is not well studied.
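As a concrete illustration of the kind of tuning involved (a sketch added for this write-up, not a description of the study's actual setup), the Python snippet below uses the subprocess module to launch a benchmark inside a Docker container with CPU and memory limits, and to boot a KVM-accelerated QEMU guest with host CPU passthrough. The Docker flags (--cpuset-cpus, --memory) and QEMU options (-enable-kvm, -cpu host, -smp, -m) are standard; the container image, disk image, core range, and benchmark path are hypothetical placeholders.

    import subprocess

    def run_in_docker(benchmark_cmd):
        # Pin the container to cores 0-3 and cap its memory at 8 GiB to
        # limit interference from co-located workloads on the same host.
        subprocess.run(
            ["docker", "run", "--rm",
             "--cpuset-cpus", "0-3",
             "--memory", "8g",
             "ubuntu:latest"]          # placeholder image
            + benchmark_cmd,
            check=True)

    def run_in_kvm(disk_image):
        # Boot a KVM-accelerated guest; -cpu host exposes the host CPU
        # model so the guest sees the same instruction-set extensions
        # (e.g., AVX) as bare metal. taskset crudely pins the whole VM
        # process to cores 0-3.
        subprocess.run(
            ["taskset", "-c", "0-3",
             "qemu-system-x86_64",
             "-enable-kvm",
             "-cpu", "host",
             "-smp", "4",
             "-m", "8192",
             "-drive", f"file={disk_image},format=qcow2",
             "-nographic"],
            check=True)

    if __name__ == "__main__":
        run_in_docker(["/opt/npb/bin/ep.C.x"])   # hypothetical benchmark path

Settings such as core pinning and host CPU passthrough are typical of the parameters one would sweep when comparing containerized and virtualized performance against bare metal.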

We seek to understand the applicability of virtualization (exemplified by KVM) and containerization (exemplified by Docker) technologies to HPC applications.