ALCF Data Science Program

The ALCF Data Science Program (ADSP) is seeking proposals that will push the state of the art in data-centric and data-intensive computing, as well as in machine learning, deep learning, and other AI methods at scale. We invite proposals for projects that aim to gain insights into very large scientific datasets (experimental, simulation, or observational) and to apply advanced analytical methods using data science and machine learning techniques.

From April 27, 2018, to June 20, 2018, ADSP’s open call provides an opportunity for researchers to propose transformational advances in data science and software technology through allocations of computer time and supporting resources at the ALCF.

Call for Proposals

The call for proposals opens April 27, 2018, and ends on June 20, 2018, 5:00 PM CST. Please see the proposal instructions for more information.

The ADSP, now in its third year, targets “big data” science problems that require the scale and performance of leadership computing resources. ADSP projects are two-year awards. Principal investigators (PIs) will be required to submit a renewal application for the second year of the award.

[Proposal Instructions]

Program Overview

ADSP projects will focus on employing leadership-class systems and infrastructure to explore, develop, and advance a wide range of data science techniques. These techniques include uncertainty quantification, statistics, machine learning, deep learning, databases, pattern recognition, image processing, graph analytics, data mining, real-time data analysis, and complex and interactive workflows. The winning proposals will be awarded time on ALCF resources and will receive support and training from dedicated ALCF staff. Applications undergo a review process to evaluate potential impact, data-scale readiness, diversity of science domains and algorithms, and other criteria. This year there will be an emphasis on identifying projects that can use the architectural features of Theta in particular, as future ADSP projects will eventually transition to Aurora, ALCF’s upcoming Intel-Cray system.

Computing Platforms

ADSP project teams will have access to ALCF computing resources, including Theta, the 11.69-petaflops system based on the second-generation Intel® Xeon Phi™ processor; Mira, the 10-petaflops IBM Blue Gene/Q system; visualization and analytics clusters; and storage systems.

ADSP Resources

  • Staff and Postdoc Support: The chosen ADSP projects will receive support from the Data Science group, a vibrant multidisciplinary team of scientists and high-performance computing (HPC) software engineers. Selected projects may be funded in part by Data Science postdocs.
  • Training and Hardware Access: ALCF will offer targeted training for the ADSP projects. Depending on the requirements of the projects, this training is likely to include a detailed introduction to the hardware and software stack, access to early hardware, deep dives on specific hardware features, and customized tutorials.
  • Computing and Storage Resources: ADSP projects will be awarded compute time and storage space on Theta and/or Mira. The initial compute time awards are expected to be in the range of 20 to 30 million core-hours. Awards for the second year will be based on progress in the first year and consultation with the project’s PI. Initial storage allocations may be up to 100 terabytes; additional project needs could be accommodated in consultation with the ALCF.

Eligibility Requirements

This call is open to US- and non-US-based researchers and research organizations in academia, industry, national laboratories, and other research institutions that need large allocations of computer time, supporting resources, and data storage. DOE sponsorship is not required to participate. Note that federal laws regulate what can be done on ALCF systems. For example, Classified Information, National Security Information, and Unclassified Controlled Nuclear Information cannot be stored on ALCF systems.

Reporting Requirements

ADSP project teams are expected to provide quarterly progress reports, participate in update calls, collaborate with ALCF staff, help prepare highlights of notable accomplishments and results, and provide a written report at the end of the project.