IBM Blue Gene Documents
IBM maintains openly available Blue Gene Redbooks. Below are links to a search of the IBM Redbooks site for 'Blue Gene' and to the main IBM Redbooks page, where you can see a list of the latest books and search for other Redbooks.
Please note that the documentation for the XL compilers on Blue Gene tends to discuss only BG-specific issues:
Consult the Linux versions of the documents for a more complete description of general XL functionality:
Block (partition) state and job state transitions are shown below. The highlighted letters correspond to those reported by bg-listblocks, bg-listjobs, and mmcs_db_console.
 * Normal block state transition
 *
 *
 *                 (C)onfigure ------------------------------>(D)eallocate
 *                   ^     |           ^               ^             |
 *                   |     V           |               |             V
 *   -->(F)ree <-->(A)llocated --> (B)ooting --> (I)nitialized --> (T)erminating
 *                                     ^           |                     |
 *   ^                                 ^           |----------|          |
 *   |                                 |                      V          |
 *   |                                 -------------(R)ebooting          |
 *   |                                                                   |
 *   |                                                                   |
 *   |                                                                   |
 *   ---------------------------------------------------------------------
 *
 * Note: In the above diagram, the reboot process will use the (R) status only
 *       when initiated by mpirun. When reboot is done by an administrator, using
 *       mmcs_db_console, the block goes directly from (I) to (B).
 *
 * Abnormal block state transition
 *
 * - any error accessing the database will put the block into an (E)rror state,
 *   from which the only state transition is to (F)ree state
 * - fatal RAS errors while (B)ooting or (I)nitialized will set the block state
 *   to (D)eallocate
 *
 * Notes on (C)onfigure and (D)eallocate
 *
 * - the (C)onfigure and (D)eallocate states may be set by external components
 *   such as mpirun. These states are monitored by mmcs, and used to trigger
 *   block allocations and deallocations. The other states are only set by mmcs.
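The block state diagram and its notes can be read as a small transition table. The sketch below is one such reading in Python, offered as an illustration only: the names BLOCK_TRANSITIONS, is_valid_block_transition, and first_invalid_step are hypothetical and are not part of mmcs, mpirun, or any Blue Gene tool, and the edge set is an interpretation of the diagram rather than an authoritative list.

```python
# One reading of the block state diagram above; NOT an authoritative table.
# Keys and values are the status letters reported by the listing tools.
BLOCK_TRANSITIONS = {
    "F": {"A"},                 # (F)ree <--> (A)llocated
    "A": {"F", "B", "C"},       # Allocated may return to Free, boot, or be Configured
    "C": {"A", "D"},            # (C)onfigure ---> (D)eallocate
    "B": {"I", "D"},            # fatal RAS error while Booting sets (D)eallocate
    "I": {"T", "R", "B", "D"},  # R: reboot via mpirun; B: reboot by an administrator
    "R": {"B"},                 # (R)ebooting returns to (B)ooting
    "D": {"T"},                 # Deallocate leads to (T)erminating
    "T": {"F"},                 # assumed: Terminating returns the block to Free
    "E": {"F"},                 # the only transition out of (E)rror is to Free
}


def is_valid_block_transition(src, dst):
    """Return True if src -> dst is allowed under the reading above."""
    if src not in BLOCK_TRANSITIONS:
        raise ValueError(f"unknown block state: {src!r}")
    if src == "E":
        return dst == "F"
    if dst == "E":
        # Any error accessing the database puts the block into (E)rror.
        return True
    return dst in BLOCK_TRANSITIONS[src]


def first_invalid_step(path):
    """Return the index of the first disallowed step in path, or None."""
    for i, (src, dst) in enumerate(zip(path, path[1:])):
        if not is_valid_block_transition(src, dst):
            return i
    return None
```

For example, first_invalid_step(["F", "A", "B", "I", "T", "F"]) returns None under this reading, while first_invalid_step(["F", "B"]) returns 0 because a Free block must be Allocated before it can boot. A table like this can be handy for sanity-checking a sequence of states observed in bg-listblocks output.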
 * Job State diagram:
 *
 *
 *   deleted <--- (Q)ueued ----> ready to (S)tart ----> (R)unning ---> (T)erminated
 *                                                         |            ^
 *                                                        \/            |
 *                                                      (D)ying----------
 *
 * The block and job state diagrams above show that blocks and jobs can be deleted.
 * A deleted block or job does not have a state, since it no longer exists.
 * The reason to include them in the diagrams is simply to show that blocks and
 * jobs can only be removed when they are in the Free and Queued states, respectively.
 *
 * Job State diagram for debugging a program prior to running:
 *
 *   (Q)ueued ---> l(O)ad ---> (L)oaded ---> (B)egin ---> (R)unning
 *                                |             ^
 *                                v             |
 *                               (A)ttach Debugger
 *
 * Job State diagram for debugging a running job:
 *
 *   (Q)ueued ---> ready to (S)tart ---> (R)unning
 *                                        |     ^
 *                                        v     |
 *                                        Debu(G)
 *
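The three job state diagrams can be merged into a single directed graph and searched. The following Python sketch does that with a breadth-first search; JOB_TRANSITIONS and shortest_path are hypothetical names chosen for illustration, and the edges are one interpretation of the diagrams (in particular, that the debugger is attached between Loaded and Begin), not a definitive specification. Note that job letters overlap with block letters: R means Running for a job but Rebooting for a block.

```python
from collections import deque

# One reading of the job state diagrams above; NOT an authoritative table.
# "deleted" is omitted because a deleted job no longer has a state.
JOB_TRANSITIONS = {
    "Q": {"S", "O"},       # (Q)ueued -> ready to (S)tart, or l(O)ad when debugging
    "S": {"R"},            # ready to (S)tart -> (R)unning
    "R": {"T", "D", "G"},  # (R)unning -> (T)erminated, (D)ying, or Debu(G)
    "D": {"T"},            # (D)ying -> (T)erminated
    "G": {"R"},            # Debu(G) -> back to (R)unning
    "O": {"L"},            # l(O)ad -> (L)oaded
    "L": {"B", "A"},       # (L)oaded -> (B)egin, optionally via (A)ttach Debugger
    "A": {"B"},            # (A)ttach Debugger -> (B)egin (assumed placement)
    "B": {"R"},            # (B)egin -> (R)unning
    "T": set(),            # (T)erminated has no outgoing transitions
}


def shortest_path(start, goal, transitions=JOB_TRANSITIONS):
    """Breadth-first search for the shortest state sequence from start to goal."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]
        # Sort neighbors so the search order is deterministic.
        for nxt in sorted(transitions.get(state, ())):
            if nxt not in parents:
                parents[nxt] = state
                queue.append(nxt)
    return None
```

Under this reading, shortest_path("Q", "T") yields the ordinary Q, S, R, T sequence, and shortest_path("T", "Q") returns None because a Terminated job never returns to the queue.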