Exploring Hybrid Parallelization through Computational Solid Mechanics

Piotr Fidkowski
Seminar

Abstract:
Heterogeneous clusters that add hardware accelerators to their nodes are becoming more popular, driven by hardware trends and by the constraints of scaling computing toward the exascale. Three of the top ten computers in the Top500 list use hardware accelerators, including the current #2 supercomputer. Achieving optimal performance on such machines requires hybrid parallelization, which combines shared-memory parallelism at the node level with message passing between nodes.

In this talk, I will motivate the need for hybrid parallelization and address some of the missing pieces in accelerator computing. I will present the results of a case study in hybrid parallelization using an MPI-based, explicit finite element solver on unstructured meshes for computational solid mechanics. The domain decomposition used for unstructured FEM provides a natural hierarchy for hardware acceleration at the node level. A performance improvement of 20-30x has been demonstrated using a hybrid CUDA+MPI approach. I will also discuss data management issues and possible extensions to MPI that would facilitate accelerator computing.

Bio:
Piotr Fidkowski is a master's student in aerospace engineering at the Massachusetts Institute of Technology, working under Professor Raul Radovitzky. His research interests are in computational solid mechanics, specifically discontinuous Galerkin methods and GPGPU computing.