Aurora Early Science Program Proposal Instructions


The Call for Proposals for the Aurora Early Science Program is now closed.

 

General Information and Submission Instructions

We intend Aurora Early Science Program (ESP) proposals to be relatively simple and short, a stripped-down version of an INCITE proposal. The proposal has the following sections:

1. PI and co-PI Information

2. Project Summary

  • Executive Summary
  • Benefit to Community
  • Impact Statement
  • Science Summary
  • Application Summary

3. Estimate of Resources Requested

4. Portability

5. Project Team Members

6. Commitments/Expectations

For details on how proposals will be evaluated, see the Aurora ESP Call for Proposals webpage.

Submission

  • Submission deadline: September 2, 2016 before midnight in any time zone
  • We are using the EasyChair system for proposal submission. You’ll need to create an account if you don’t have one already, and log in to the Aurora ESP EasyChair website.
  • Prepare your proposal using the instructions below
  • Submit your proposal as a single PDF document, uploaded through EasyChair. You may resubmit with revisions as needed up until the deadline.

Please direct any questions to earlyscience@alcf.anl.gov. If needed, contact Tim Williams at 630-252-1154.

Proposal Instructions

Please create your proposal document with a project title and the section headings noted below. We provide a template Word document for those who want to use it.

Section 1: PI and co-PI Information

1a. Principal Investigator (PI) Information

  • Last Name, First Name, Title (Dr., Mr., Ms., etc.)
  • Institution
  • Street address
  • Email address

1b. Co-Principal Investigator (co-PI) Information

For each co-PI:

  • Last Name, First Name, Title (Dr., Mr., Ms., etc.)
  • Institution
  • Street address
  • Email address

Section 2: Project Summary

2a. Executive Summary

Write an executive summary that accurately describes your proposed research and the high-impact scientific advances you will achieve with access to early resources at the ALCF. (1/2 page)

2b. Benefit to Community

Write a description of the benefit your project will provide to the science and HPC community. (1/2 page)

2c. Impact Statement

Provide a two-sentence project summary that can be used to describe the impact of your project to the public (50 words maximum).

2d. Science Summary

Write a description of the science problem you would like to address in the early 2019 time frame. Include research that will need to be completed in the next two years to lead up to this work. (1 page).

2e. Application Summary

2e.i. Application Requirements

Write a list of your application requirements, including languages, libraries, and current parallel method (MPI, OpenMP, etc.) (1 page).

2e.ii. Application Description

Write a description of the current application, including methods, parallelization, I/O, etc. (1 page).

2e.iii. Application Development Needed

Write a description of the code and/or algorithmic development you believe will be necessary to exploit increased parallelism per node and increased overall parallelism. Include any work that will be needed on MPI parallelism. Consider how you might use the memory hierarchy on the Knights Hill (KNH) nodes: at least 16 GB of high-bandwidth on-package memory (HBM), plus some combination of off-package DRAM and nonvolatile memory (NVM). A presentation from Intel discusses the three modes of using the high-bandwidth memory on Knights Landing CPUs (cache for DRAM, flat, or hybrid), which you can use as a guide for Knights Hill. Memory-bandwidth-bound applications with good strong scaling may consider running entirely from the HBM. The simplest and possibly best approach for some codes might be to use the entire HBM automatically as a cache for off-package memory. (1 page).
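To make the "running entirely from the HBM" option concrete, here is a minimal Python sketch of the arithmetic involved: given a total working set, it estimates the smallest node count at which an evenly divided per-node share fits in 16 GB. The function names and the `usable_fraction` parameter are hypothetical, and the model ignores replicated data and load imbalance, so treat the result as a first estimate rather than a sizing rule.

```python
import math

HBM_GB_PER_NODE = 16.0  # lower bound quoted above ("at least 16 GB")


def min_nodes_to_fit_in_hbm(total_working_set_gb, usable_fraction=1.0):
    """Smallest node count at which an evenly divided working set fits in HBM.

    usable_fraction is a hypothetical knob for the share of HBM left over
    after the OS, MPI buffers, and replicated data; 1.0 means all of it.
    """
    if total_working_set_gb <= 0 or not 0 < usable_fraction <= 1:
        raise ValueError("expected a positive size and a fraction in (0, 1]")
    per_node_budget_gb = HBM_GB_PER_NODE * usable_fraction
    return math.ceil(total_working_set_gb / per_node_budget_gb)


def fits_in_hbm(total_working_set_gb, nodes, usable_fraction=1.0):
    """True if the per-node share of the working set fits in the HBM budget."""
    return nodes >= min_nodes_to_fit_in_hbm(total_working_set_gb, usable_fraction)
```

For instance, `min_nodes_to_fit_in_hbm(2048)` returns 128: a 2 TB working set needs at least 128 nodes before each node's share drops to 16 GB. Below that node count, the cache or hybrid modes are the more natural starting point.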

Section 3: Estimate of Resources Requested

If you haven't already, you should look at the slide presentation "ALCF-3 Early Science Program", which has some relevant machine performance information. You'll be making two CPU resource requests.

The first request is for development time on our current machines. It should be modest: on the order of one to a few million core-hours at most. This covers development work that does not depend strongly on having the new hardware: implementing new algorithms, adding new physics modules, introducing or scaling up threads, etc.

The second request is for Early Science period time on our next-generation machine, Aurora. This is a large request, for the CPU time you'll need to run your proposed science problem. The total amount of Aurora CPU time available for Early Science is on the order of 6.5 billion KNH core hours, and only 10 projects will be awarded; use this to check whether your Aurora time request is in the right ballpark. If you have an estimate of how many current-generation BG/Q core hours your science runs would consume, one reasonable conversion factor is the estimate derived from numbers in the aforementioned slides: Aurora applications can expect an 18x performance increase on the full machine relative to the full BG/Q machine Mira (based on the ratio of peak speeds of 180 PFLOPS and 10 PFLOPS). Since BG/Q has 16 cores per node, and KNH may have many more cores per node, you'll need to keep that factor in mind when expressing your time estimates in Aurora core hours. (There is no disclosed estimate for KNH core count; use 60 cores per node for your proposal core-hour estimates.) However you make your estimates, please explain your estimation method; the "brief schedule" parts of your proposal are a good place for this explanation.

Express your CPU time requests in millions of core hours (BG/Q and/or Theta (KNL) core hours for the first request, and KNH core hours for the second request).
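As a rough illustration of the conversion described above, the Python sketch below turns a BG/Q core-hour figure into KNH core hours by way of full-machine hours. The Mira node count (49,152) comes from Mira's published configuration rather than from this page, and `aurora_nodes` is a hypothetical input: no Aurora node count is given here, so substitute whatever figure your estimate is based on and state it in your proposal. The function names are illustrative only.

```python
MIRA_NODES = 49_152          # Mira's published node count; not stated on this page
BGQ_CORES_PER_NODE = 16      # from the text above
FULL_MACHINE_SPEEDUP = 18.0  # 180 PFLOPS / 10 PFLOPS, from the text above
KNH_CORES_PER_NODE = 60      # placeholder prescribed above for proposals


def bgq_to_knh_core_hours(bgq_core_hours, aurora_nodes):
    """Convert BG/Q core hours to KNH core hours via full-machine hours.

    aurora_nodes is an assumption supplied by the caller; this page does
    not disclose Aurora's node count.
    """
    mira_cores = MIRA_NODES * BGQ_CORES_PER_NODE
    # Hours the run would occupy all of Mira.
    mira_machine_hours = bgq_core_hours / mira_cores
    # The same work on all of Aurora, using the peak-speed ratio.
    aurora_machine_hours = mira_machine_hours / FULL_MACHINE_SPEEDUP
    # Back to core hours, now counted in KNH cores.
    return aurora_machine_hours * aurora_nodes * KNH_CORES_PER_NODE


def average_award_core_hours(total_core_hours=6.5e9, projects=10):
    """Even split of the Early Science pool, as a ballpark reference."""
    return total_core_hours / projects


def to_millions(core_hours):
    """Express a core-hour figure in millions, as requested above."""
    return core_hours / 1e6
```

For example, `to_millions(average_award_core_hours())` gives 650.0, the average share in millions of KNH core hours if the 6.5 billion were split evenly across 10 projects. Because the 18x factor is based on peak speeds, real applications may see a different ratio, which is one more reason to explain your method.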

3a. Current-Generation System (Mira/Theta) Resources:

  • Mira and/or Theta time in core-hours
  • Disk space in TB
  • Tape archive space in TB
  • Brief schedule for how you would use that time on Mira and/or Theta to prepare for early access to next-generation hardware and the final next-generation system: scaling tests, development (e.g. algorithms, physics modules), verification, parameter sweeps, porting to Xeon Phi architecture, etc. Assume that your Mira and/or Theta access begins on 1 January 2017 and continues until the start of the Early Science period on Aurora (1 January 2019; exact date subject to change). Break this down into milestones as appropriate for your project. (1/2 page).

3b. Next-Generation System (Aurora) Resources:

  • Aurora time in core-hours
  • Disk space in TB
  • Tape archive space in TB
  • Breakdown for how you would use time on Aurora to make final preparations for science runs, and for the science runs themselves. Preparations might include final scaling tests, science problem spin-up runs, etc. For the science runs themselves, estimate the total core-hours and break them down into separate components/milestones as appropriate. You should plan to complete all of this during the (approximately) three-month Early Science period, when you and the other Aurora ESP projects will have dedicated pre-production access. Early Science starts on 1 January 2019 (exact date subject to change). You will have continued access after those three months, but you will then be sharing the machine with all our production users, and may run at lower priority. (1/2 page).
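One way to sanity-check a request against the three-month window is to ask how many nodes it would occupy if it ran continuously for the whole period. The Python sketch below does that arithmetic, assuming a 90-day period (an approximation of "approximately three months") and the 60-core placeholder from Section 3; the function name is hypothetical.

```python
def sustained_nodes_needed(request_core_hours, period_days=90, cores_per_node=60):
    """Average number of nodes kept busy around the clock by a request.

    period_days=90 approximates the three-month Early Science period;
    cores_per_node=60 is the proposal placeholder, not a disclosed figure.
    """
    if period_days <= 0 or cores_per_node <= 0:
        raise ValueError("period_days and cores_per_node must be positive")
    period_hours = period_days * 24
    return request_core_hours / period_hours / cores_per_node
```

An even 650-million-core-hour share, for instance, works out to roughly 5,000 nodes busy around the clock for 90 days. If your milestones imply a much higher sustained node count than the machine can plausibly give one project, the schedule probably needs trimming.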

Section 4: Portability

An important focus of the DOE Leadership Computing Facility going forward is application portability. We see two basic architectural tracks leading from today’s supercomputers (Mira at ALCF and Titan at OLCF) to the next generation (Theta/Aurora at ALCF and Summit at OLCF), and on to exascale from there. After Aurora and Summit, the next-generation LCF systems will be exascale machines. Mira, Theta, and Aurora represent what we’ll call the many-core CPU track; Titan and Summit represent the CPU-GPU track.

4a. Portability Approach

Briefly discuss your plans, if any, to achieve portability of your project’s application(s) across different supercomputer architectures, covering at least the two aforementioned tracks. (1/2 page).

4b. Participation in Other Applications-Readiness Programs

Indicate whether your team, or others you are aware of using the same code base, have projects under the NERSC NESAP program or the OLCF CAAR program. Also indicate if you have an active project in the ALCF Theta Early Science Program.

Section 5: Project Team Members

5a. Names and Levels of Effort

List the names and levels of effort (as a percentage of full-time) for all team members you expect to do work on the ESP project.

For each person, include a CV. If you have trouble getting all of the CVs into the PDF proposal document you are submitting, email earlyscience@alcf.anl.gov for assistance.

Section 6: Commitments/Expectations

Please confirm that, should your proposal be awarded as an ESP project, you will commit to meeting the following three requirements:

  1. Having your institution(s) sign a multiparty NDA (nondisclosure agreement) with Intel and with Cray, so that you may speak with ALCF and other ESP participants about NDA information
  2. Helping recruit an ALCF postdoc to work on your project team in a timely manner. The goal is to hire within the first 6 months of the project
  3. Preparing, during the first two months of your project (after selection), a detailed project plan with tasks/milestones that we can use to document and report progress until Aurora is accepted and the Early Science dedicated access period begins; ALCF will provide guidance on this

Indicate this on your proposal by copying the requirements and indicating “Confirmed” next to each.

Good Luck!
