Runjob termination on BG/Q

Execution on the compute nodes is started by the runjob command (Blue Gene/Q's equivalent of mpirun).  How runjob terminates is determined in turn by how the ranks of your MPI program terminate:

  1. An MPI rank calls exit(0)  ---> wait for all ranks to call exit(), then terminate
  2. An MPI rank calls exit(1)  ---> kill all ranks immediately and terminate
  3. An MPI rank calls exit(n) for n>1  ---> wait for all ranks to call exit(), then terminate ***
  4. Any rank receives an uncaught signal  ---> kill all ranks immediately and terminate

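*** Note that Case 3 often results in a deadlock that lasts until the job's allotted time runs out: the rank that has already exited can no longer participate in collective operations, so the remaining ranks block waiting for it and never reach exit() themselves.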
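
The four default cases above can be summarized as a small decision function. This is only an illustrative model of the rules as listed, not part of runjob or any IBM API; the names default_runjob_action and may_hang are hypothetical, and exit codes outside the listed cases (negative values) are rejected rather than guessed at.

```python
from typing import Optional

KILL_ALL = "kill all ranks immediately and terminate"
WAIT_ALL = "wait for all ranks to call exit(), then terminate"


def default_runjob_action(exit_code: Optional[int] = None,
                          signaled: bool = False) -> str:
    """Model of the default rules listed above (hypothetical helper).

    exit_code: value a rank passed to exit(), or None if it was signaled.
    signaled:  True if the rank died from an uncaught signal.
    """
    if signaled:
        return KILL_ALL              # Case 4: uncaught signal
    if exit_code is None or exit_code < 0:
        raise ValueError("outcome not covered by the listed cases")
    if exit_code == 1:
        return KILL_ALL              # Case 2: exit(1)
    return WAIT_ALL                  # Case 1 (exit(0)) and Case 3 (exit(n), n > 1)


def may_hang(exit_code: int) -> bool:
    """Case 3: a rank exits with n > 1 while the other ranks keep running."""
    return exit_code > 1


if __name__ == "__main__":
    for code in (0, 1, 2):
        print(f"exit({code}): {default_runjob_action(code)}")
    print(f"signal: {default_runjob_action(signaled=True)}")
```

In this model, may_hang(n) is true exactly for the exit codes where runjob keeps waiting even though one rank has already left with an error.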

You can modify this behavior by setting the environment variable BG_EXITIMMEDIATLYONRC=1 with runjob's --envs option.  In that case, the behavior is:

  1. An MPI rank calls exit(0)  --->  wait for all ranks to call exit(), then terminate
  2. An MPI rank calls exit(n) for n>=1  ---> kill all ranks immediately and terminate
  3. Any rank receives an uncaught signal  ---> kill all ranks immediately and terminate
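
As a rough sketch of how the variable might be passed, the helper below assembles a runjob command line with one --envs NAME=VALUE pair per variable. The helper name build_runjob_command, the executable ./a.out, and the rank count are assumptions for illustration; the variable name is copied exactly as written above. A real job script usually needs further options (for example, block selection), which are omitted here.

```python
from typing import Dict, List


def build_runjob_command(exe: str, np: int, envs: Dict[str, str]) -> List[str]:
    """Assemble a runjob argument list (hypothetical helper, not an IBM tool)."""
    cmd = ["runjob", "--np", str(np)]
    for name, value in envs.items():
        # Each variable is passed as its own --envs NAME=VALUE pair.
        cmd += ["--envs", f"{name}={value}"]
    # The ':' separates runjob's own options from the executable.
    cmd += [":", exe]
    return cmd


# Placeholder values: 1024 ranks, executable ./a.out.
cmd = build_runjob_command("./a.out", np=1024,
                           envs={"BG_EXITIMMEDIATLYONRC": "1"})
print(" ".join(cmd))
```

Building the command as a list keeps each argument separate, so it can be handed to subprocess.run(cmd) from a job script without additional shell quoting.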