ALCF Data Science Program

The ALCF Data Science Program (ADSP) aims to accelerate discovery across a broad range of scientific domains that require data-intensive and machine learning algorithms to address challenging research problems. This program connects leading researchers with ALCF scientists to push the state of the art in machine learning, workflows, data analysis, and algorithmic development.

Successful projects are given the opportunity to partner with outstanding research teams, both in the Data Science group and at the ALCF, to deploy scalable learning and data analysis on leadership computing resources. The ADSP can be used to achieve computational readiness for ALCF programs such as INCITE and to run on future systems such as Aurora, a new Intel-Cray system scheduled for deployment in 2021 that will be capable of over 1 exaflops.

Call for Proposals

The ADSP employs a competitive proposal process that awards allocations of time on ALCF supercomputers. All proposals are peer reviewed by a panel of experts for both scientific impact and computational readiness. ADSP awards run for two years and are made to researchers from academia, government research facilities, and industry. PIs will be required to submit a renewal application for the second year of the award. If you have any questions, contact adsp@alcf.anl.gov.

SUBMISSION DEADLINE: July 1, 2019 at 5 PM CDT

Program Overview

Ongoing and past ADSP projects span a diverse range of science domains (e.g., materials, imaging, neuroscience, engineering, combustion/CFD, and cosmology) and involve both large science collaborations (APS, LSST, DESC, LIGO, DES, ATLAS) and smaller research groups developing machine learning methods at scale. ADSP projects benefit from hardware and system architectures at ALCF that support data analysis and machine learning with a common software stack, allowing for large-scale science campaigns. The program also offers selected projects directed assistance and computational time to gain experience scaling their codes on ALCF systems, leading to future INCITE and ALCC proposals and projects on DOE supercomputing systems.

Scientists at the ALCF partner with each ADSP project, assisting in code and methods development, optimization, workflow creation, and data analysis and visualization. Example techniques and areas of research that would leverage ALCF’s experience include hyperparameter optimization, uncertainty quantification, statistics, machine learning, deep learning, databases, pattern recognition, image processing, graph analytics, data mining, real-time data analysis, and complex and interactive workflows.

ALCF Computing Resources

ADSP project teams will have access to ALCF computing resources:

  • Theta has 4,392 nodes, each with a 64-core KNL processor, 16 gigabytes of high-bandwidth in-package memory, and 192 gigabytes of DDR4 RAM. Each node has 128 gigabytes of node-local SSD storage. The aggregate peak compute speed is 11.69 petaflops. Storage includes a 10-petabyte Lustre parallel file system.
  • Cooley is a visualization and analysis cluster with 126 compute nodes; each node has 12 CPU cores and one NVIDIA Tesla K80 dual-GPU card. The entire Cooley system has a total of 47 terabytes of system RAM and 3 terabytes of GPU RAM.

ADSP Resources

  • Staff and Postdoc Support: The chosen ADSP projects will receive support from the Data Science group, a multidisciplinary team of scientists and high-performance computing software engineers. Top-tier selected projects may be funded in part by Data Science postdocs.
  • Training and Hardware Access: The ALCF will offer one-on-one assistance for R&D. Depending on the requirements of the projects, this may include a detailed introduction to the hardware and software stack, access to early hardware, deep dives on specific hardware features, and customized tutorials.
  • Computing and Storage Resources: ADSP projects will be awarded compute time and storage space on Theta. The initial compute time awards are expected to be in the range of a few to tens of millions of core-hours. Awards for the second year will be based on progress in the first year and consultation with the project’s PI. Initial storage requirements may be up to 100 terabytes; additional project needs will be accommodated in consultation with the ALCF.
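To give a feel for the scale of a core-hour award, the short Python sketch below converts core-hours into Theta node-hours and into equivalent hours of the whole machine. It uses the node and core counts quoted in the Theta description above. The function names are hypothetical, and the sketch assumes a job is charged for all 64 cores of every node it occupies; the actual ALCF charging policy may differ.

```python
# Figures quoted in the Theta description of this call.
THETA_NODES = 4392
CORES_PER_NODE = 64


def core_hours_to_node_hours(core_hours, cores_per_node=CORES_PER_NODE):
    """Convert core-hours to node-hours.

    Assumption: every core of an occupied node is charged.
    """
    return core_hours / cores_per_node


def full_machine_hours(core_hours, nodes=THETA_NODES,
                       cores_per_node=CORES_PER_NODE):
    """Hours the award would last if every Theta node ran continuously."""
    return core_hours / (nodes * cores_per_node)


if __name__ == "__main__":
    # Illustrative award sizes only, chosen to span "a few to tens of millions".
    for award in (2_000_000, 10_000_000, 50_000_000):
        node_hours = core_hours_to_node_hours(award)
        machine_hours = full_machine_hours(award)
        print(f"{award:>12,} core-hours = {node_hours:>10,.0f} node-hours "
              f"= {machine_hours:6.1f} full-machine hours")
```

Under these assumptions, a 10-million core-hour award corresponds to 156,250 node-hours, or roughly 35.6 hours of the entire Theta system.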

Eligibility Requirements

This call is open to US- and non-US-based researchers and research organizations in universities, academia, industry, national laboratories and other research institutions needing large allocations of computing time, supporting resources, and data storage. DOE sponsorship is not required to participate. Note that there are federal laws regulating what can be done on ALCF systems. As an example, Classified Information, National Security Information or Unclassified Controlled Nuclear Information cannot be stored on our systems.

Reporting Requirements

ADSP project teams are expected to provide quarterly progress reports, participate in update calls, collaborate with ALCF staff, help prepare highlights of notable accomplishments and results, and provide a written report at the end of the project.

Submission Instructions

Evaluation of Proposals

Proposals will be evaluated on the strength of:

  • Potential impact of proposed science on the respective domain.
  • Demonstrated scalability of the target application on Theta (or on a comparably large cluster).
  • Description of the datasets and plans to realize the data science.
  • Appropriateness of development team: the likelihood that the project members’ expertise and proposed person-hours will accomplish the science goals or software development described.
  • Overall diversity of science domains and algorithms.