Blue Gene/Q Versus Blue Gene/P

Machine Architecture: Mira/Cetus/Vesta vs. Intrepid/Challenger/Surveyor

The new Mira/Cetus/Vesta systems (ALCF2) are IBM BG/Q machines. They are similar to the older Intrepid/Challenger/Surveyor IBM BG/P systems (ALCF1) but offer a number of evolutionary improvements, summarized below.

Hardware               Mira/Cetus/Vesta (BG/Q)    Intrepid/Challenger/Surveyor (BG/P)
Architecture           64-bit Power A2            32-bit PowerPC 450d
Cores per node         16 (*)                     4
Clock speed            1600 MHz                   850 MHz
Memory per node        16 GB                      2 GB
Floating-point unit    4-way SIMD (QPX)           2-way SIMD (Double Hummer)
Network                5D Torus                   3D Torus + Tree

(*) A 17th core may be used by the communication library.

For additional information on the system see: http://www.alcf.anl.gov/user-guides/system-overview.

Access

All BG/Q systems now require the use of a CRYPTOCard. No BG/Q system supports the use of ssh keys.

For additional information on CRYPTOCards, see: http://www.alcf.anl.gov/user-guides/using-cryptocards.

SoftEnv

ALCF2 continues to use the SoftEnv system to control user environments, but the ‘rc’ file has been renamed. The new file is ‘.soft’ in your home directory. Initially, it should contain only one key: ‘@default’.

 

                Mira/Cetus/Vesta (BG/Q)    Intrepid/Challenger/Surveyor (BG/P)
SoftEnv file    ~/.soft                    ~/.softenvrc

 

Compiling

A default MPI wrapper is no longer provided. There are six MPI wrapper choices, each selected by a soft key. Choose one of the following keys and add it to your .soft file.

  • +mpiwrapper-xl
  • +mpiwrapper-xl.legacy
  • +mpiwrapper-xl.ndebug
  • +mpiwrapper-xl.legacy.ndebug
  • +mpiwrapper-gcc
  • +mpiwrapper-gcc.legacy
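
As an illustration of the step above, the following sketch writes a sample SoftEnv file that selects the XL wrappers. It assumes the wrapper key should come before ‘@default’, and it writes to a scratch path (soft.example, a hypothetical name) rather than overwriting ~/.soft, so the contents can be reviewed first.

```shell
# Sketch only: write a sample SoftEnv file to a scratch location.
# The path "soft.example" is a placeholder; copy the result to ~/.soft
# once you are happy with it.
soft_file="${TMPDIR:-/tmp}/soft.example"

cat > "$soft_file" <<'EOF'
+mpiwrapper-xl
@default
EOF

# Show what would go into ~/.soft.
cat "$soft_file"
```

Any of the other five keys can be substituted for +mpiwrapper-xl. Only one wrapper key is expected in the file.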

For additional information on compiling and linking, see: http://www.alcf.anl.gov/user-guides/compiling-linking.

Software

All ALCF-provided software is under /soft. ALCF2 has restructured the /soft filesystem to have additional subdirectories other than just /soft/apps. Libraries will be found under /soft/libraries, performance tools are under /soft/perftools, and debuggers are under /soft/debuggers.

For additional information on software, see: http://www.alcf.anl.gov/user-guides/software-and-libraries.

Job Submission

ALCF2 continues to use the Cobalt job scheduler and all basic client commands remain the same. The job submission command, qsub, has a modified flag ‘--mode’ which takes a new set of values. The possible values determine the number of MPI ranks per compute node.

Example:
qsub -A <project> -n <nodes> -t <walltime> -q <queue> --mode <c1,c2,c4,c8,c16,c32,c64>
c1 is one rank per compute node, c2 is two ranks per compute node, and so on up to c64, which is sixty-four ranks per compute node.
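
Because the mode value fixes the ranks per node, the total number of MPI ranks in a job is the node count multiplied by the number in the mode name. A minimal sketch of that arithmetic follows; the function name total_ranks is hypothetical and is not part of Cobalt.

```shell
# Sketch: total MPI ranks for a node count and a --mode value (c1 ... c64).
# total_ranks is an illustrative helper, not a Cobalt command.
total_ranks() {
  nodes=$1
  rpn=${2#c}                # strip the leading "c": c16 -> 16
  echo $(( nodes * rpn ))
}

total_ranks 512 c16         # prints 8192
```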

For additional information on running jobs see: http://www.alcf.anl.gov/user-guides/queueing-running-jobs.

Script Jobs

Script jobs are still fully supported on ALCF2 machines, but jobs should be launched with the ‘runjob’ command instead of ‘cobalt-mpirun’. The ‘runjob’ command has new parameters that reflect the BG/Q options. Here is an example:

#!/bin/bash
#script.sh
rpn=16 # MPI ranks per node
runjob --block $COBALT_PARTNAME --np $(($COBALT_JOBSIZE*$rpn)) --ranks-per-node $rpn : /path/to/binary <arguments to binary>
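
The script above relies on COBALT_PARTNAME and COBALT_JOBSIZE being set in its environment. A guard such as the one sketched below could be placed before the runjob line so that the script fails early when run by hand. The function name require_cobalt_env is hypothetical.

```shell
# Sketch: refuse to continue unless the Cobalt variables used by the
# runjob line are present. require_cobalt_env is an illustrative name.
require_cobalt_env() {
  if [ -z "${COBALT_PARTNAME:-}" ] || [ -z "${COBALT_JOBSIZE:-}" ]; then
    echo "COBALT_PARTNAME and COBALT_JOBSIZE must be set; submit this script through Cobalt" >&2
    return 1
  fi
}
```

In a script it would typically be called as `require_cobalt_env || exit 1` before invoking runjob.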

For additional information on submitting a script job, see: http://www.alcf.anl.gov/user-guides/running-jobs#submitting-a-script-job

Rank Mapping

The way BG_MAPPING determines rank placement is reversed from BG/P to BG/Q. On BG/P, setting BG_MAPPING to XYZT places ranks along the X dimension first, then Y, then Z, and finally T, so the leftmost letter varies fastest. On BG/Q, setting BG_MAPPING to ABCDET places ranks first along T, then E, then D, then C, then B, and finally A, so the rightmost letter varies fastest.
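
The ordering rule described above can be illustrated with a small sketch that computes the coordinates of a given rank under either convention. It models only the ordering, not what the runtime actually does. The function name rank_coords and the dimension sizes in the example calls are assumptions chosen for illustration; real sizes depend on the partition.

```shell
# Sketch: coordinates of one rank for a mapping string.
# usage: rank_coords <rank> <first|last> <name:size>...
#   first = leftmost dimension varies fastest  (BG/P rule described above)
#   last  = rightmost dimension varies fastest (BG/Q rule described above)
rank_coords() {
  rank=$1; fastest=$2; shift 2

  # Product of all dimension sizes.
  total=1
  for dim in "$@"; do total=$(( total * ${dim#*:} )); done

  stride=1        # ranks per step of the current dimension when fastest=first
  rest=$total     # same quantity when fastest=last (shrinks left to right)
  out=""
  for dim in "$@"; do
    name=${dim%%:*}; size=${dim#*:}
    if [ "$fastest" = first ]; then
      coord=$(( (rank / stride) % size ))
      stride=$(( stride * size ))
    else
      rest=$(( rest / size ))
      coord=$(( (rank / rest) % size ))
    fi
    out="$out${out:+ }$name=$coord"
  done
  echo "$out"
}

# BG/P style, XYZT: rank 1 is the next position along X.
rank_coords 1 first X:2 Y:2 Z:2 T:4            # prints X=1 Y=0 Z=0 T=0

# BG/Q style, ABCDET: rank 1 is the next position along T.
rank_coords 1 last A:2 B:2 C:2 D:2 E:2 T:16    # prints A=0 B=0 C=0 D=0 E=0 T=1
```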