GLEAN: Scalable In Situ Analysis and I/O Acceleration on Leadership Computing Systems

GLEAN is a flexible and extensible framework that takes application, analysis, and system characteristics into account to facilitate simulation-time data analysis and I/O acceleration. The GLEAN infrastructure hides significant details from the end user while providing a flexible interface to the fastest path for their data and analysis needs and, ultimately, to scientific insight. It provides an infrastructure for accelerating I/O, for interfacing to running simulations for co-analysis, and for in situ analysis, with zero or minimal modifications to the existing application code base. Nonintrusive integration is achieved by embedding GLEAN in higher-level I/O libraries such as PnetCDF and HDF5.

Primary Contact: 

Venkat Vishwanath, venkat@anl.gov

Publications: 

V. Vishwanath, M. Hereld, V. Morozov, and M. E. Papka, "Topology-aware data movement and staging for I/O acceleration on Blue Gene/P supercomputing systems", In Proceedings of the IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2011), Seattle, USA, November 2011.

V. Vishwanath, H. Bui, M. Hereld, and M. Papka, "GLEAN", book chapter in "High Performance Parallel I/O", CRC Press, Taylor and Francis Group, November 2014.

H. Bui, V. Vishwanath, H. Finkel, K. Harms, J. Leigh, S. Habib, K. Heitmann, and M. E. Papka, "Scalable parallel I/O on Blue Gene/Q supercomputer using compression, topology-aware data aggregation, and subfiling", In Proceedings of the 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP 2014), Turin, Italy, February 2014.

S. Habib, V. Morozov, N. Frontiere, H. Finkel, A. Pope, K. Heitmann, K. Kumaran, V. Vishwanath, T. Peterka, J. Insley, D. Daniel, P. Fasel, and Z. Lukic, "HACC: Extreme Scaling and Performance Across Diverse Architectures", In Proceedings of the IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2013), Denver, Colorado, USA, November 2013.

M. Rasquin, P. Marion, V. Vishwanath, B. Matthews, M. Hereld, K. Jansen, R. Loy, A. Bauer, M. Zhou, O. Sahni, J. Fu, N. Liu, C. Carothers, M. Shephard, M. E. Papka, K. Kumaran, and B. Geveci, "Co-visualization of full data and in situ data extracts from unstructured grid CFD at 160K cores", In Proceedings of the IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2011), Seattle, USA, November 2011.

V. Vishwanath, M. Hereld, and M. E. Papka, "Simulation-time data analysis and I/O acceleration on leadership-class systems using GLEAN", In Proceedings of the IEEE Symposium on Large Data Analysis and Visualization (LDAV), Providence, RI, USA, October 2011.

V. Vishwanath, M. Hereld, M. Papka, R. Hudson, G. Jordan, and C. Daley, "Towards simulation-time data analysis and I/O acceleration of FLASH astrophysics simulations on leadership-class systems using GLEAN", SCIDAC, Denver, Colorado, 2011.

V. Vishwanath, M. Hereld, K. Iskra, D. Kimpe, V. Morozov, M. Papka, R. Ross, and K. Yoshii, "Accelerating I/O Forwarding in IBM Blue Gene/P Systems", In Proceedings of the IEEE/ACM International Conference for High Performance Computing, Networking, Storage, and Analysis (SC 2010), pp. 1-10, November 2010.

Funding: 
ALCF, DOE ASCR