Uncertainties in Big Data Visualization: Theory, Scalability, and Design

Hanqi Guo
Seminar

Data analysis and visualization in many domains (materials science, climate, weather, fluid dynamics, system fault tolerance) must incorporate data uncertainties. Tackling big data with uncertainties requires new theories, highly scalable methods, and innovative designs. In this talk, I will highlight three directions we have been studying.

On uncertainty visualization theory, we have been working to define features in uncertain data and to quantify uncertainties in deterministic analyses. I will illustrate these two aspects with new definitions of separatrices for uncertain flows in weather simulations and with uncertainty quantification of vortices in superconductor simulations, respectively.

On uncertainty visualization scalability, I will present studies on scalable flow analysis: sparse flow data management, coupled ensemble flow visualization, decoupled uncertain flow analysis, higher-order prefetching, dynamic load balancing with k-d trees, parallel partial reduction, flow trajectory compression, and in situ feature tracking. These techniques have been tested and used on U.S. and Chinese supercomputers with different architectures. Our most recent study of uncertain flow analysis scales up to 256K Blue Gene/Q cores on Mira and to 128K Opteron cores and 8K GPUs on Titan.

On uncertainty visualization design, I will briefly introduce my research on user interface designs for uncertain data exploration. I will demonstrate La VALSE, a visual analysis tool that lets system resilience researchers and system administrators study Mira by exploring tens of millions of noisy RAS (reliability, availability, and serviceability) logs.

Bio:
Dr. Hanqi Guo is a Postdoctoral Appointee in the Mathematics and Computer Science Division. He received his Ph.D. degree in Computer Science from Peking University in 2014 and his B.S. degree in Mathematics and Applied Mathematics from Beijing University of Posts and Telecommunications in 2009. His research interests are mainly in the visualization of large-scale scientific data that involve uncertainties. He has published more than 20 papers in premier visualization venues, including IEEE TVCG, IEEE VIS, and IEEE PacificVis. He also serves as a program committee member for IEEE VIS, IEEE PacificVis, the SIGGRAPH Asia Symposium on Visualization, and the Eurographics Symposium on Parallel Graphics and Visualization (EGPGV).