Investigating the Root Causes of I/O Interference in HPC Storage Systems

Orcun Yildiz
Seminar

As we move towards the Exascale era, performance variability in HPC systems remains a challenge. I/O interference is one of the major causes of this performance variability. Many works try to mitigate I/O interference by assuming a particular root cause and then trying to optimize or eliminate it. However, the root causes of I/O interference can be very diverse. In this work, we conduct an extensive experimental campaign to explore these diverse root causes of I/O interference in HPC storage systems. We use micro-benchmarks on the Grid’5000 testbed to evaluate how the applications’ access patterns, the network components, the file system’s configuration and the backend storage devices influence I/O interference. In this talk, we present the results of our investigation together with the lessons learned from these experiments, which we hope will enable a better understanding of the I/O interference phenomenon across all components of the I/O stack.

Bio:
Orcun Yildiz is a first-year PhD student in the KerData team at Inria Rennes-Bretagne Atlantique (France). His research interests include distributed systems, high performance computing and energy-efficient big data management. He received his BSc degree in computer science from Bogazici University, Turkey. He graduated with a double master's degree in Distributed Computing (EMDC) from the Royal Institute of Technology (KTH), Sweden and Instituto Superior Tecnico (IST), Portugal.