Tier 1 Science Project
The reach of next-generation cosmological surveys will be determined by our ability to control systematic effects. Current measurements constrain cosmological parameters at the few-percent level; to reach a more definitive conclusion about the cause of the accelerated expansion of the universe, the accuracy of these measurements must improve by another order of magnitude. Achieving this goal requires understanding the physics of the universe on ever smaller scales, where the physics is far more complicated and an accurate treatment of baryonic effects becomes crucial.
Simulating galaxy formation from first principles is, and will remain for the near future, an intractable task. Researchers therefore rely on so-called subgrid models, which allow them to incorporate a range of astrophysical effects into the simulations, such as supernova feedback, star formation, cooling, and feedback from active galactic nuclei. This modeling approach relies heavily on empirical findings. With this ESP project, the team has two major aims: (1) further the understanding of astrophysical processes by confronting detailed simulations with the newest observations, and (2) with this new understanding, enable reliable modeling of baryonic physics and thereby provide an approach to mitigating possible contamination of cosmological results by baryonic effects. The researchers will explore a range of well-motivated subgrid models, extract their observational signatures, and confront these findings with observational data, making progress toward both goals.
Impact: This project will have a major impact on cosmology and astrophysics. By confronting new observations with sophisticated simulations, the research team will further the understanding of astrophysical processes on small scales. At the same time, they will disentangle these processes from fundamental physics and therefore help mitigate one of the major sources of systematic uncertainties for upcoming cosmological surveys.
As part of this project, the team will carry out a suite of large-volume simulations of (800 h⁻¹ Mpc)³ volume with 2×2048³ particles each, leading to a mass resolution of ∼5·10⁹ h⁻¹ M⊙ for the dark matter species and a factor of five lower mass for the baryons. These specifications may change slightly once first test results are in hand.
The HACC framework simulates the mass evolution of the universe using particle-mesh techniques. HACC splits the force calculation into a specially designed grid-based long/medium-range spectral particle-mesh (PM) component that is common to all architectures, and an architecture-specific short-range solver. The short-range solvers can use direct particle-particle interactions, i.e., a P3M algorithm, as on Roadrunner or Titan, or tree methods, as on the IBM Blue Gene/Q and Cray XE6 systems (TreePM algorithm). A new hydrodynamic capability is currently being implemented in HACC. The team has developed and tested a new algorithm called Conservative Reproducing Kernel Smoothed Particle Hydrodynamics (CRKSPH), which directly addresses some of the shortcomings of traditional SPH methods relative to Eulerian adaptive mesh refinement methods, such as the suppression of mixing.
Parallelization of the long-range force calculation uses MPI. Short-range force algorithms, which depend on the architecture, are expressed in the appropriate programming model – thus the approach can be characterized as “MPI + X”, where, in this case, X can be OpenMP, CUDA, or OpenCL. HACC computational kernels have been heavily optimized for architectures including Mira and Titan. Scaling and performance as a fraction of machine peak speed are exceptional.
- The main task will be optimizing the short-range solver (gravity + hydro) for KNL.
- HACC's hydrodynamics capability currently exists only as a prototype implementation. The main developer will focus on a GPU version for the CAAR project. A new team member (JD Emberson), joining in January 2016, will take on some of the hydro implementation tasks.
- Different subgrid models will need to be implemented, drawing on the experience of collaborators in this area.
- CosmoTools will also need to be enhanced significantly for the hydro version, as many new baryon-specific analysis and visualization components will need to be added. Theta's large per-node memory (192 GB) not only accommodates the increased memory footprint of the baryonic tracer particles, but also allows in situ analysis algorithms that would otherwise consume too much memory.
HACC addresses portability with a code structure that flexibly adapts to architectures. First, it combines MPI with a variety of more local programming models (e.g., OpenMP, OpenCL, CUDA). Second, while the main code infrastructure (C/C++/MPI) remains unchanged, the short-range force modules can be plugged in and out as required. For a given architecture, not only may the programming model be different, but even the algorithms are allowed to change. The team has recently started to implement fully portable analysis tools using PISTON and NVIDIA’s Thrust library, which provides CUDA, OpenMP, and Intel TBB backends for data-parallel primitives. These primitives include such operators as scan, transform, and reduce, each of which can be customized with user-defined functors. HACC has been benchmarked on KNL simulators and some of the analysis tools have been run on Stampede.