Singularity on Theta


These instructions were presented in our Web Developer Series (Feb 2019); the slides can be found here: https://anl.box.com/s/b9366zeqq10ufk3laxzb0dkjf8d96bvt

Examples of Singularity Recipes can be found here: https://github.com/jtchilders/singularity_image_recipes

A Note on Docker

ALCF does not support Docker on our resources because it has security problems that allow privilege escalation. Singularity was developed with these security issues in mind.

Singularity on ALCF Resources

Singularity is a container solution for scientific application workloads. For details on Singularity and its usage, see the Singularity user guide. This page is dedicated to information pertaining to Theta at the ALCF. Please send questions to support@alcf.anl.gov.

Building a Singularity Container

Singularity containers can be built directly from Docker images hosted on Docker Hub using:

singularity build new_local_image.simg docker://username/image_name:image_version
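
For example, to build a local image from the official CentOS image on Docker Hub (the output file name and the tag here are only illustrative):

singularity build centos_7.simg docker://centos:7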

Otherwise, an image must be built on one's own computing resources. Installation instructions can be found here: Install Singularity

Running a container

You can run a container on Theta using this style of batch script:

#!/bin/bash
#COBALT -t 30
#COBALT -q debug-cache-quad
#COBALT -n 2
#COBALT -A datascience

# pass container as first argument to script
CONTAINER=$1

# Use Cray's ABI (Application Binary Interface) compatible MPI build
module swap cray-mpich cray-mpich-abi

# include CRAY_LD_LIBRARY_PATH into the system library path
export LD_LIBRARY_PATH=$CRAY_LD_LIBRARY_PATH:$LD_LIBRARY_PATH
# also need this additional library
export LD_LIBRARY_PATH=/opt/cray/wlm_detect/default/lib64/:$LD_LIBRARY_PATH
# in order to pass environment variables to a Singularity container create the variable
# with the SINGULARITYENV_ prefix
export SINGULARITYENV_LD_LIBRARY_PATH=$LD_LIBRARY_PATH
# print to log file for debug
echo $SINGULARITYENV_LD_LIBRARY_PATH

RANKS_PER_NODE=4
TOTAL_RANKS=$(( $COBALT_JOBSIZE * $RANKS_PER_NODE ))

# 'ldd /myapp/pi' run inside the container (e.g. via 'singularity exec') should show the app
# linked against the host machine's Cray libmpi.so, not the MPICH built inside the container
# run the container like an application, which will execute '/myapp/pi'
aprun -n $TOTAL_RANKS -N $RANKS_PER_NODE singularity run -B /opt:/opt:ro -B /etc/alternatives:/etc/alternatives $CONTAINER

 

There are some important items in the container's build script ('build.sh', which the recipe below copies into the image and runs). It downloads and builds MPICH. The configure command requires the '--disable-wrapper-rpath' option. This ensures that when you use 'mpicc' inside the container to build your application (in this case 'pi.c'), MPICH does not insert its own path into the RPATH of the compiled binary. If you do not disable this feature, any binary built with that MPICH installation will be forced to use that MPICH library. Since we intend to run the application on Theta, it must pick up the Cray MPI installation at run time to get correct cross-node communication. This also means the application must be dynamically linked against MPICH rather than statically linking MPICH into the binary.
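
The build script itself is not reproduced on this page. The sketch below shows what a minimal 'build.sh' could look like; the MPICH version, download URL, install prefix, and compile flags are assumptions for illustration. The essential pieces are the '--disable-wrapper-rpath' configure option and dynamically linking the application against MPICH.

#!/bin/bash
# build.sh - illustrative sketch; versions and paths are assumptions
set -e

# download and build MPICH; --disable-wrapper-rpath keeps mpicc from baking
# the container's MPICH path into the binary's RPATH
VERSION=3.3
wget http://www.mpich.org/static/downloads/${VERSION}/mpich-${VERSION}.tar.gz
tar xf mpich-${VERSION}.tar.gz
cd mpich-${VERSION}
./configure --prefix=/mpich/install --disable-wrapper-rpath
make -j 4 install
export PATH=/mpich/install/bin:$PATH
export LD_LIBRARY_PATH=/mpich/install/lib:$LD_LIBRARY_PATH
cd ..

# build the example application with the container's mpicc;
# it must stay dynamically linked so the Cray libmpi.so can be substituted at run time
mpicc -o /myapp/pi /myapp/pi.c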

Next, create a file named "SingularityFile" and paste the following contents into it:

Bootstrap: docker
From: centos

%setup
 # runs on the host; ${SINGULARITY_ROOTFS} is the container's root filesystem
 echo ${SINGULARITY_ROOTFS}
 mkdir ${SINGULARITY_ROOTFS}/myapp

%files
 /vagrant_data/pi.c /myapp/
 /vagrant_data/build.sh /myapp/

%post
 # runs inside the container at build time
 yum update -y
 yum groupinstall -y "Development Tools"
 yum install -y gcc
 yum install -y gcc-c++
 yum install -y wget

 # build MPICH and the example application
 cd /myapp
 ./build.sh

%runscript
 # executed by 'singularity run'
 /myapp/pi
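
The recipe by itself is not an image. On a machine where you have root privileges (image builds require root), the image can be built from the recipe with a command along these lines; the output name 'pi.simg' is just an example. The resulting image file can then be copied to Theta.

sudo singularity build pi.simg SingularityFile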

Running Singularity on Theta

Singularity has been set up to automatically mount a number of directories, including /proc, /sys, and $HOME, into the running image.

You can 'bind-mount' additional directories into the container with the -B flag; for instance, the Cray/Intel programming environment lives under the /opt directory:

singularity exec -B /opt:/opt my_image.img <executable>

This runs the given executable inside the container. To run the %runscript defined in the SingularityFile recipe (here, /myapp/pi), use 'singularity run' instead of 'singularity exec'.

You can also run other commands inside the image, for example:

singularity exec -B /opt:/opt my_image.img ls /myapp

In order to run the example MPI application built above, we need to create a Cobalt submit script. Copy and paste the following into a file named "submit.sh":

#!/bin/bash
#COBALT -t 30
#COBALT -q debug-cache-quad
#COBALT -n 2
#COBALT -A datascience

# pass container as first argument to script
CONTAINER=$1

# Use Cray's ABI (Application Binary Interface) compatible MPI build
module swap cray-mpich cray-mpich-abi

# include CRAY_LD_LIBRARY_PATH into the system library path
export LD_LIBRARY_PATH=$CRAY_LD_LIBRARY_PATH:$LD_LIBRARY_PATH
# also need this additional library
export LD_LIBRARY_PATH=/opt/cray/wlm_detect/default/lib64/:$LD_LIBRARY_PATH
# in order to pass environment variables to a Singularity container create the variable
# with the SINGULARITYENV_ prefix
export SINGULARITYENV_LD_LIBRARY_PATH=$LD_LIBRARY_PATH
# print to log file for debug
echo $SINGULARITYENV_LD_LIBRARY_PATH

RANKS_PER_NODE=4
TOTAL_RANKS=$(( $COBALT_JOBSIZE * $RANKS_PER_NODE ))

# 'ldd /myapp/pi' run inside the container (e.g. via 'singularity exec') should show the app
# linked against the host machine's Cray libmpi.so, not the MPICH built inside the container
# run the container like an application, which will execute '/myapp/pi'
aprun -n $TOTAL_RANKS -N $RANKS_PER_NODE singularity run -B /opt:/opt:ro -B /etc/alternatives:/etc/alternatives $CONTAINER

Then you can submit your job with 'qsub -n <Number_of_nodes> -q <queue_name> -t <time> -A <project_name> submit.sh <container name>'
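
For example, assuming the image built above is named 'pi.simg' and using the same values as the #COBALT directives already in the script:

qsub -n 2 -q debug-cache-quad -t 30 -A datascience submit.sh pi.simg

Replace 'datascience' with your own project allocation.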

Checking Singularity version

Theta usually runs the current stable version of Singularity. To get the version number, type:

$ singularity --version

Mitigating userid errors when scaling up

Theta Singularity users may encounter errors such as 'unknown userid' stemming from excessive outbound LDAP requests originating from the compute nodes. These errors are likely to occur in either large-scale MPI runs or ensemble jobs containing multiple Singularity runs. To mitigate the issue, users should prime the local credential cache on each compute node by running the following procedure at the beginning of their Cobalt job script:

# prepare credential cache to avoid LDAP issues
# Run in increments of 1024 nodes, up to the full value of $COBALT_JOBSIZE
# Uncomment lines as necessary below:

#aprun -N 1 -n 1024 /soft/tools/prime-cache
#sleep 10
#aprun -N 1 -n 2048 /soft/tools/prime-cache
#sleep 10
#aprun -N 1 -n 3072 /soft/tools/prime-cache
#sleep 10
#aprun -N 1 -n 4096 /soft/tools/prime-cache
#sleep 10
# aprun -N 1 -n $COBALT_JOBSIZE /soft/tools/prime-cache