Optimizing Intranode, GPU-to-GPU Communication in MPI

Feng Ji
Seminar

GPU accelerators are gaining popularity in high performance computing systems. Though they can bring significant gains in performance and power efficiency, GPUs introduce a distinct "device" memory that must be managed by the programmer. When programmers wish to communicate data to or from the GPU using MPI, they must currently move data explicitly between host and device memory, leading to inefficient resource utilization and extra data movement operations.
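To make that overhead concrete, the following sketch (illustrative only, not the speaker's code; the helper names and fixed tag are assumptions) shows the staging pattern the abstract refers to: every GPU-to-GPU message pays a device-to-host copy on the sender and a host-to-device copy on the receiver, in addition to MPI's own host-side transfer.

    /* Illustrative sketch of the conventional staging pattern; the helper
     * names and tag value are assumptions, not part of any MPI API. */
    #include <stdlib.h>
    #include <mpi.h>
    #include <cuda_runtime.h>

    void send_from_gpu(const void *d_buf, size_t bytes, int dest, MPI_Comm comm)
    {
        void *h_buf = malloc(bytes);
        /* Extra copy 1: device memory -> private host staging buffer */
        cudaMemcpy(h_buf, d_buf, bytes, cudaMemcpyDeviceToHost);
        MPI_Send(h_buf, (int)bytes, MPI_BYTE, dest, 0, comm);
        free(h_buf);
    }

    void recv_to_gpu(void *d_buf, size_t bytes, int src, MPI_Comm comm)
    {
        void *h_buf = malloc(bytes);
        MPI_Recv(h_buf, (int)bytes, MPI_BYTE, src, 0, comm, MPI_STATUS_IGNORE);
        /* Extra copy 2: private host staging buffer -> device memory */
        cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);
        free(h_buf);
    }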

In this project, we address the efficiency of communicating data from GPU to GPU within the same node using MPI. We attack this problem with two techniques: (1) using host-side shared memory to eliminate extra memory copies, and (2) enabling GPU context sharing across processes for direct device-to-device communication. Results indicate that the use of host-side shared memory can yield up to 2.5x speedup for large messages. GPU context sharing is not currently supported by CUDA; however, we have made significant progress toward enabling this highly efficient technique. I will report on the performance potential of context sharing as well as the technical hurdles we have overcome in this ongoing effort.
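For intuition about technique (1), here is a minimal sketch built on POSIX shared memory. The region name, helper names, and permissions are illustrative assumptions, and in practice the shared region would be managed inside the MPI library rather than by the application; synchronization between the two processes is also omitted. The point is that the sender's device-to-host copy can land directly in memory the receiving process has mapped, so no additional host-to-host copy is needed between them.

    /* Illustrative only: "/gpu_msg_region" and the function names are
     * assumptions used to show copying through a shared host buffer
     * instead of a private staging buffer. Link with -lrt on Linux. */
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <cuda_runtime.h>

    #define REGION_NAME "/gpu_msg_region"

    /* Sender: one copy, device -> shared host region visible to the peer. */
    void publish_gpu_data(const void *d_buf, size_t bytes)
    {
        int fd = shm_open(REGION_NAME, O_CREAT | O_RDWR, 0600);
        ftruncate(fd, (off_t)bytes);
        void *shared = mmap(NULL, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        cudaMemcpy(shared, d_buf, bytes, cudaMemcpyDeviceToHost);
        munmap(shared, bytes);
        close(fd);
    }

    /* Receiver: one copy, shared host region -> its own device memory. */
    void consume_gpu_data(void *d_buf, size_t bytes)
    {
        int fd = shm_open(REGION_NAME, O_RDWR, 0600);
        void *shared = mmap(NULL, bytes, PROT_READ, MAP_SHARED, fd, 0);
        cudaMemcpy(d_buf, shared, bytes, cudaMemcpyHostToDevice);
        munmap(shared, bytes);
        close(fd);
        shm_unlink(REGION_NAME);
    }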

In addition, if time permits, I will discuss my ongoing work at NCSU: building a software distributed shared memory system for CPU-GPU heterogeneous systems. In this project, we seek to address the productivity and performance challenges of maintaining distinct CPU and GPU memories through the use of a software-based shared memory model.

BIO:

Feng Ji is starting his fourth year as a PhD candidate in the Department of Computer Science at North Carolina State University, under the guidance of Dr. Xiaosong Ma. He is interested in systems research for parallel architectures, especially GPU-enabled heterogeneous systems. He received his bachelor's and master's degrees from Zhejiang University, Hangzhou, China. With an expected graduation in 2013, he hopes to find a systems research position.