Collective I/O: State of the Art and Future Developments

Event Sponsor: 
Mathematics and Computer Science Division Seminar
Start Date: 
Apr 17 2017 - 10:30am
Building/Room: 
Building 240/Room 1404-1405
Location: 
Argonne National Laboratory
Speaker(s): 
Giuseppe Congiu
Host: 
Pavan Balaji

Abstract:
Collective I/O is a parallel I/O technique designed to deliver high-performance data access to scientific applications running on large-scale computing clusters. It has been used by the HPC community for many years and has been improved repeatedly since its conception in the early 1990s. The goal of this presentation is to outline the basic idea of collective I/O and to describe the most relevant improvements that have been proposed and integrated over the years, which have allowed I/O performance to scale to current large-scale clusters.
The first part of the presentation introduces the basic idea of collective I/O. The main flaws in its design are then discussed, followed by some of the solutions that have been proposed to address them. Finally, an alternative solution that uses non-volatile memory devices to boost collective I/O performance is introduced. The presentation also describes in detail how the proposed solution is integrated into the ROMIO middleware and discusses the corresponding performance measurements.
 
Biography:
Giuseppe Congiu holds a Bachelor's degree (2005) and a Master's degree (2008) in Electric and Electronic Engineering from the University of Cagliari. From January 2009 to August 2010 he worked as a software developer on an Italian government-funded project to build an AID medical imaging system. In February 2011 he joined Xyratex as a PhD fellow in the SCALUS (SCALing by means of Ubiquitous Storage) project, a Marie Curie Initial Training Network (MCITN) program funded by the European Commission (EC). In December 2013 he started working on the DEEP-ER (Dynamic Exascale Entry Platform - Extended Reach) project, an Exascale research project funded by the EC. He is currently working on two other EC-funded Exascale research projects: the CoE (Centre of Excellence) ESiWACE and the FETHPC (Future Emerging Technology for HPC) SAGE. He is expected to receive his PhD from the University of Mainz (Germany) in 2017.