Qbox

What is Qbox?

Qbox is a C++/MPI scalable parallel implementation of first-principles molecular dynamics (FPMD) based on the plane-wave, pseudopotential formalism. Qbox is designed for operation on large parallel computers.

Obtaining Qbox

http://eslab.ucdavis.edu/software/qbox

Building Qbox for Blue Gene/Q

Qbox requires the standard math libraries plus the Xerces-C library (http://xerces.apache.org/xerces-c).

Xerces-C 3.1.1

In the xerces directory, run:

   ./configure --disable-shared  --disable-pretty-make --disable-threads --enable-transcoder-iconv CC=mpixlc_r CXX=mpixlcxx_r CFLAGS=-O2 CXXFLAGS=-O2  --prefix=${HOME}/xerces-c-3/ 
   make
   make install

The libraries and headers will be installed in the following paths:

Libraries: ${HOME}/xerces-c-3/lib

Headers: ${HOME}/xerces-c-3/include

Compiling Qbox

Download our architecture-dependent makefile <link to file bgq-anl.mk> and copy it into the Qbox src directory. You can now build Qbox using make.

   export TARGET=bgq-anl
   make

Performance Notes

The text below is taken from a discussion on the Qbox user forum regarding NROWMAX.

The nrowmax variable is used to determine the shape of the rectangular process grid used by Qbox. This process grid is the one used by the ScaLAPACK library. When Qbox starts, the ntasks MPI tasks are assigned to processes arranged in a rectangular array of dimensions nprow * npcol. The default value of nrowmax is 32. The plane-wave basis is divided among nprow blocks, and the electronic states are divided among npcol blocks.

The following algorithm is used by Qbox to determine the values of nprow and npcol:

1. The number of rows nprow is first set to nrowmax.

2. The value of nprow is then decremented until ntasks%nprow==0, i.e. nprow divides the total number of tasks.

3. The value of npcol is then given by ntasks/nprow.

While this looks cryptic, what the algorithm tries to achieve is actually quite simple: define a process grid of dimensions nrowmax*npcol, where npcol=ntasks/nrowmax. This is not always possible, in particular if ntasks%nrowmax != 0. This is why the second step of the algorithm decrements nprow until ntasks%nprow==0.

Note: The value of nprow is never larger than nrowmax (hence the name).

This algorithm is implemented in Wavefunction::create_contexts() in the file Wavefunction.C.

Examples:

  • ntasks=128, nrowmax=32 (default) => process grid 32 x 4
  • ntasks=48, nrowmax=32 (default) => process grid 24 x 2
  • ntasks=256, nrowmax=64 => process grid 64 x 4
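
The three steps and the examples above can be reproduced with a short standalone script. This is only a sketch of the algorithm as described on this page, not the code from Wavefunction.C; the function name process_grid is hypothetical and chosen for illustration.

```python
def process_grid(ntasks, nrowmax=32):
    """Return (nprow, npcol) following the three steps described above.

    Illustrative helper only; it is not part of Qbox.
    """
    if ntasks < 1 or nrowmax < 1:
        raise ValueError("ntasks and nrowmax must be positive integers")
    nprow = nrowmax                # step 1: start from nrowmax
    while ntasks % nprow != 0:     # step 2: decrement until nprow divides ntasks
        nprow -= 1                 # always terminates, since 1 divides ntasks
    npcol = ntasks // nprow        # step 3: the remaining tasks form the columns
    return nprow, npcol


if __name__ == "__main__":
    for ntasks, nrowmax in [(128, 32), (48, 32), (256, 64)]:
        nprow, npcol = process_grid(ntasks, nrowmax)
        print(f"ntasks={ntasks}, nrowmax={nrowmax} => process grid {nprow} x {npcol}")
```

Running the script prints the three process grids listed in the examples. It can be used to check which grid a given job size would produce before submitting the job.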

The shape of the process grid affects performance. In general, it is advantageous to have nprow as large as possible, but not larger than the size of the (fine) FFT grid in the z direction. For example, if the fine FFT grid (printed as np0v,np1v,np2v on output) is 110 x 110 x 110, the value of nrowmax should be 110. Note that other values of nrowmax also work, but performance is usually inferior. For example, one could use nrowmax=128 even if the grid is 110 x 110 x 110, but some of the processes will not be used optimally during FFTs.
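
Because nprow always ends up as a divisor of ntasks, only a few distinct grids are actually reachable for a given job size. The sketch below lists the values of nprow that do not exceed the z dimension of the fine FFT grid, largest first, as a shortlist of nrowmax values worth testing. It is a planning heuristic based on the guideline above, not a Qbox feature; the function name candidate_nrowmax and the parameter name np2v (taken from the output label mentioned above) are assumptions for illustration.

```python
def candidate_nrowmax(ntasks, np2v):
    """List divisors of ntasks that do not exceed np2v, largest first.

    Illustrative helper only; it is not part of Qbox. np2v is assumed to be
    the size of the fine FFT grid in the z direction.
    """
    if ntasks < 1 or np2v < 1:
        raise ValueError("ntasks and np2v must be positive integers")
    upper = min(ntasks, np2v)
    return [n for n in range(upper, 0, -1) if ntasks % n == 0]


if __name__ == "__main__":
    # 256 tasks with a 110 x 110 x 110 fine FFT grid
    print(candidate_nrowmax(256, 110)[:4])  # prints [64, 32, 16, 8]
```

In this example, setting nrowmax to any value from 64 to 110 yields the same 64 x 4 grid, since 64 is the largest divisor of 256 that does not exceed 110.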

Choosing the value of nrowmax is usually a trial-and-error process. Before running long simulations, it is advisable to run a few test jobs with different values of nrowmax and choose the value that gives the best performance.