Determining Allocation Requirements

Estimating CPU-Hours for ALCF Blue Gene/Q Systems

When estimating CPU-hours for the ALCF Blue Gene/Q systems, account for the unique aspects of the Blue Gene architecture to arrive at an accurate estimate.

Is the BG/Q Right for Your Job?

A detailed hardware overview can be found on the Machine Overview page. This material is critical to understanding whether your job can run. For example, each Blue Gene/Q node has 16 GB of memory. Depending on your level of threading, and therefore on how many MPI processes share a node, each MPI process might have as little as 256 MB or as much as 16 GB.
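The memory figures above can be checked with a short calculation. The following is a minimal sketch, assuming node memory is split evenly among the MPI processes on a node; the function name is hypothetical, and the 64-processes-per-node case is inferred from the 256 MB figure rather than stated on this page.

```python
# Sketch: memory share of each MPI process on one Blue Gene/Q node.
# Assumes an even split of node memory among processes (an assumption,
# not a statement about how the system actually partitions memory).
NODE_MEMORY_MB = 16 * 1024  # 16 GB per node, as stated above


def memory_per_process_mb(processes_per_node):
    """Return the memory (in MB) available to each MPI process on a node."""
    if processes_per_node < 1:
        raise ValueError("processes_per_node must be at least 1")
    return NODE_MEMORY_MB / processes_per_node


if __name__ == "__main__":
    # One process per node, one per core, and the inferred 64-per-node case.
    for ppn in (1, 16, 64):
        print(ppn, "processes per node:", memory_per_process_mb(ppn), "MB each")
```

If the per-process share is smaller than your application's memory footprint, run fewer processes per node and use more threads per process instead.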

Understanding BG/Q Blocks

Like all Blue Gene architectures, the Blue Gene/Q allocates nodes in partitions (also called “blocks”). The partition sizes available for all platforms can be found on the Machine Partitions page. A job can run only within a single partition, and no other users have access to that partition while it is in use.

Keep in mind that CPU time is charged based on partition size, not job size. If the number of nodes required by a job is not a power of two, the project must round up to the next power of two to determine the correct allocation to request (with a minimum size of 32, 128, or 512 nodes, depending on the system). If the job does not fill the entire partition, the unused nodes remain idle and are not available to anyone else; this is why we always charge for the entire partition.
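The rounding rule above can be sketched as follows. This is only an illustration of the power-of-two rule as stated here; the function name and parameters are hypothetical, and the partition sizes actually offered on a given system are the ones listed on the Machine Partitions page.

```python
# Sketch: round a node count up to the partition size that would be charged.
# Assumes partition sizes double from the system minimum (32, 128, or 512),
# which follows the power-of-two rule described above.
def partition_nodes(requested_nodes, minimum_nodes):
    """Return the smallest partition size that holds requested_nodes."""
    if requested_nodes < 1:
        raise ValueError("requested_nodes must be at least 1")
    if minimum_nodes not in (32, 128, 512):
        raise ValueError("minimum_nodes must be 32, 128, or 512")
    size = minimum_nodes
    while size < requested_nodes:
        size *= 2  # each step is the next power of two
    return size


if __name__ == "__main__":
    print(partition_nodes(6000, 512))  # a 6000-node job on a 512-node-minimum system
    print(partition_nodes(10, 128))    # a small job still occupies the minimum partition
```

Note that a job smaller than the system minimum is still sized, and charged, at the minimum partition.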

For example, a 6000-node job must run on an 8192-node partition, which means the project will be charged (8192 nodes x 16 cores) = 131072 CPU-hours for every hour of wall-clock time the job runs. Depending on how many threads the application uses, that is anywhere from 6000 to 131072 MPI processes.
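The charge for a full run can be estimated by scaling this figure by the run time. The sketch below assumes the number above is the charge for each hour of wall-clock time and that the charge scales linearly with run time; the function name is hypothetical.

```python
# Sketch: CPU-hours charged for a job, based on partition size, not job size.
# Assumes charge = partition nodes x 16 cores per node x wall-clock hours.
CORES_PER_NODE = 16  # Blue Gene/Q cores counted per node in the example above


def charged_cpu_hours(partition_nodes, wall_clock_hours):
    """Return the CPU-hours charged for occupying a partition."""
    if partition_nodes < 1 or wall_clock_hours < 0:
        raise ValueError("invalid partition size or run time")
    return partition_nodes * CORES_PER_NODE * wall_clock_hours


if __name__ == "__main__":
    # The 6000-node job from the example occupies an 8192-node partition.
    print(charged_cpu_hours(8192, 1))    # one hour of wall-clock time
    print(charged_cpu_hours(8192, 2.5))  # a 2.5-hour run
```

To size an allocation request, multiply the per-run charge by the number of runs the project expects to make.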