A One-sided Communication Protocol for Exascale Storage Systems

Judicael Zounmevo
Seminar

The storage subsystem of supercomputers is growing more slowly than the compute part. For instance, the Blue Gene/P generation of Argonne supercomputers has one I/O node for every 64 compute nodes, whereas the new Blue Gene/Q Mira has one I/O node for every 128 compute nodes. This compute-to-I/O node ratio is expected to get worse with the shift towards the exascale era. To better cope with the potentially overwhelming number of client requests, the exascale-directed Triton storage project puts the servers in control of I/O communications. The I/O servers decide when and how to service requests while the clients remain passive. This communication model, known as one-sided, matches the semantics of the now ubiquitous Remote Direct Memory Access (RDMA) feature, which offloads data transfer from the CPU.

In this talk, I present the one-sided communication protocol that was adopted for the Triton storage project. I briefly explain the rationale behind the protocol and show how it fits into the lifetime of a typical Triton I/O request. I then focus on the design decisions made to provide an implementation over two one-sided libraries.

Bio: Judicael Zounmevo is a Ph.D. candidate in the Electrical and Computer Engineering department of Queen's University in Kingston, ON, Canada. He is interested in optimizing the MPI one-sided communication model, as well as the MPI progress engine in general and its message queues at large scale. He is supervised by Dr. Ahmad Afsahi. This summer, he worked in the storage group with Dr. Dries Kimpe.