Onboarding Guide

USER TIP: For an introduction to using ALCF resources, view the presentations Getting Started on Theta and Cooley or Getting Started on Mira.

Step 1. Request an ALCF Project

Step 2. Get an ALCF User Account

Step 3. Logging in to an ALCF Resource

Step 4. Setting Up Your Computing Environment

Step 5. Data

Step 6. How to Run a Job

Step 7. Managing Your ALCF Account

Step 8. ALCF Acknowledgement Policy

Step 9. Getting Additional Support

 

Step 1. Request an ALCF Project

You must have a project established on the system you will be using before you can proceed. If you do not have a project, please visit www.alcf.anl.gov/user-guides/how-get-allocation to establish one.

Step 2. Get an ALCF User Account

To access ALCF resources, you need an account on our systems. The following steps describe how to request a user account at the ALCF.

  • Please go to https://accounts.alcf.anl.gov/ and click on 'Request An Account'.
  • Input your email address and other form fields and click 'Verify Email Address'.
  • You will receive an ANL verification code from accounts@alcf.anl.gov.
  • Enter the verification code on the Request A New Account page.
  • Follow the steps outlined within the account request form, making sure ALL personal information is filled in thoroughly and accurately.
  • All fields are mandatory unless marked optional.
  • After you have finished filling out the form, read through the Argonne National Lab Computer User Agreement and check the boxes.
  • Click 'Request Account'.

Step 3. Logging in to an ALCF Resource

The login nodes, also known as the front-end nodes, are the nodes that users access through the ssh command when connecting to the ALCF systems. These nodes allow for interactive activities such as editing files and compiling. The compute nodes are not directly accessible to the user; they are where users' code is executed when submitted with the qsub command. Note that the hardware and operating systems of the login and compute nodes are different.

Activating/Enrolling Your Cryptocard Token

  • Physical token: After you have received your Cryptocard token and before you can use it, you must call the ALCF Help Desk (630-252-3111 or 866-508-9181) for us to verify your identity and activate the token. If you do not perform this step, you will not be able to log on to ALCF resources using your Cryptocard token.
  • Mobile token (MobilePASS/MobilePASS+): Instructions to enroll your token will be sent to you in an email along with the self-enrollment link. The link is valid for 14 days, after which you will have to contact us so we can provision a new enrollment link.

Logging In Using Your Cryptocard

  • To log in to an ALCF resource (for example, Vesta) from a Unix machine, ssh to vesta.alcf.anl.gov: ssh vesta.alcf.anl.gov. If your username on your local machine is different from your username on ALCF resources, you will need to specify your ALCF account username within the SSH command. This can be done in one of two ways: by using the -l option or by prepending your username to the hostname.
    • ssh -l <alcf_username> vesta.alcf.anl.gov
    • ssh <alcf_username>@vesta.alcf.anl.gov
       
  • When prompted for a password:
    • Physical token: type your PIN followed by the one-time password (the Cryptocard password obtained by pressing the button on your token).
    • Mobile token: enter your PIN in the app, then type the passcode that the app generates.
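
If you log in often, an OpenSSH client configuration entry can save retyping the username. The following is a minimal sketch, not an ALCF-provided tool: the function name alcf_ssh_stanza and the username myalcfuser are hypothetical, and the stanza is only printed so you can review it before adding it to ~/.ssh/config yourself.

```shell
#!/bin/sh
# Hypothetical helper: print an OpenSSH client config stanza for an ALCF host.
# Nothing is written to disk; the output is meant to be reviewed first.
alcf_ssh_stanza() {
    host="$1"   # short machine name, e.g. vesta
    user="$2"   # your ALCF username (placeholder in the example below)
    printf 'Host %s\n    HostName %s.alcf.anl.gov\n    User %s\n' \
        "$host" "$host" "$user"
}

# Example: preview the stanza for Vesta with a placeholder username.
alcf_ssh_stanza vesta myalcfuser
```

With such an entry in place, "ssh vesta" would be equivalent to the second form shown above. You would still be prompted for your PIN and one-time password.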

If you have any problems logging in using the Cryptocard, please refer to the help section on troubleshooting your Cryptocard or call the ALCF Service Desk (630-252-3111 or 866-508-9181).

Step 4. Setting Up Your Computing Environment

Setting Up Your Shell

Once you are logged in to the XC40, BG/Q, or viz cluster system, you will be at a UNIX shell prompt in your home ($HOME) directory. If you have never used UNIX before, you will need to learn some basic commands before you can submit jobs. Links to tutorial information can be found at the Unix Guru Universe Beginners' Pages. Your account will be set up with bash as the default shell, unless you requested a different shell in your account request.

We also provide tcsh and zsh. You can change your shell by logging in to your account web page. Scroll down to the "Unix Shell" section and choose your new shell.

Setting Up Your Software Environment

XC40 Systems

We use the modules environment management system. There is a default set of modules set up for all users which allows basic compilation of applications to run on the compute nodes. The basic commands to know are:

  • module list – display the modules you currently have loaded
  • module swap – switch a currently-loaded module for an alternative
  • module load – load a new module
  • module unload – unload a loaded module
  • module avail – list all available modules, covering different compilers, libraries, and tools

See the module man page and our web page on modules for more details.
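
As an illustration of the commands listed above, here is a hedged sketch of a short session. It only inspects the environment, and it runs the module commands only if they exist on the machine. The module names in the commented-out swap line (PrgEnv-intel, PrgEnv-gnu) are assumptions; check module avail for the names actually installed.

```shell
#!/bin/sh
# Sketch: inspect the modules environment if the `module` command is present.
if command -v module >/dev/null 2>&1; then
    have_modules=yes
    module list                       # modules currently loaded
    module avail 2>&1 | head -n 20    # first 20 lines of available modules
    # To switch one loaded module for another (assumed module names):
    # module swap PrgEnv-intel PrgEnv-gnu
else
    have_modules=no
    echo "module command not found; try this on a login node" >&2
fi
```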

BG/Q Systems

A software environment management system called SoftEnv is used to control the system path and other environment variables required to run application software on the BG/Q. The first time a user logs on to BG/Q, a '.soft' configuration file will automatically be created in the user's home directory. This file will be set up with the default applications environment; typically there will be a single line with '@default' in the file. Depending on the user's applications, further modifications to the .soft file may be necessary.

SoftEnv man pages are available in the default environment (use 'man softenv'). The command 'softenv' will list all available applications. Users wishing to gain a more complete understanding of how softenv works may read the complete softenv documentation ('man softenv-intro').
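
To make the '.soft' file concrete, the sketch below writes a sample configuration to a scratch file rather than to your real ~/.soft. The key +mpiwrapper-xl is an assumed example name and may not exist on your system; the softenv command lists the real keys. Placing extra keys before @default is shown here as one possible layout, not a documented requirement, so consult 'man softenv-intro' before editing your own file.

```shell
#!/bin/sh
# Sketch: write a sample SoftEnv configuration to a scratch file for review.
# "+mpiwrapper-xl" is an assumed key name; list the real keys with `softenv`.
sample="${TMPDIR:-/tmp}/soft.example"
cat > "$sample" <<'EOF'
+mpiwrapper-xl
@default
EOF

# Show the result; copy lines into ~/.soft by hand once you are satisfied.
cat "$sample"
```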

Viz Cluster

As on BG/Q systems, we use SoftEnv on the viz cluster. Because the architectures are different (login nodes and compute nodes) but they share the same home directories, we keep the software environment setups separate by using a separate ‘.soft.cooley’ configuration file. (Cooley is the name of our current viz cluster.)

Step 5. Data

Transfer Utilities

sftp and scp

These standard utilities are available for local area transfers of small files.
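
For example, copying a single input file with scp might look like the sketch below. The helper name alcf_scp_cmd, the username, and the file paths are all placeholders. The helper prints the command instead of running it, so you can check it before use.

```shell
#!/bin/sh
# Hypothetical helper: build (but do not run) an scp command for an ALCF host.
alcf_scp_cmd() {
    user="$1"   # ALCF username
    host="$2"   # full host name, e.g. vesta.alcf.anl.gov
    src="$3"    # local file to copy
    dest="$4"   # destination path on the remote side
    echo "scp $src $user@$host:$dest"
}

# Example with placeholder values; copy the printed line to run it for real.
alcf_scp_cmd myalcfuser vesta.alcf.anl.gov ./input.dat input.dat
```

A relative destination such as input.dat is resolved against your remote home directory.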

Data Transfer Service

Globus: ALCF makes Globus endpoints available for data transfer to our systems. All ALCF endpoints have names beginning with “alcf#”.  For more information about this service, see Using Globus.

Step 6. How to Run a Job

For extensive information on running and queuing jobs, please visit our detailed web pages. Here's some basic information to get you started.

Cobalt

We use Cobalt, an Argonne-developed scheduler, for batch job submissions to run on compute nodes on all our systems. It is somewhat similar to PBS. The basic commands to know are:

  • qsub – submit a job
  • qstat – view the status of queued and running jobs
  • qdel – kill/cancel a job

Note that when you are submitting a script to Cobalt, the script file must be executable (chmod +x myscript.sh).
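
A small pre-submission check can catch a missing execute bit. The sketch below uses a hypothetical helper, ensure_executable, and a throwaway demo script; neither is part of Cobalt.

```shell
#!/bin/sh
# Sketch: make sure a job script is executable before handing it to qsub.
ensure_executable() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "no such file: $script" >&2
        return 1
    fi
    # Add the execute bit only when it is missing.
    [ -x "$script" ] || chmod +x "$script"
}

# Example with a throwaway script (the file name is a placeholder).
demo="${TMPDIR:-/tmp}/myscript_demo.sh"
printf '#!/bin/sh\necho "hello from a job script"\n' > "$demo"
ensure_executable "$demo" && echo "ready to submit: $demo"
```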

XC40 Systems

You must write a script to pass to the qsub command, using the aprun command in the script to invoke your executable. Here is an example submission command:

qsub -A <project> -n <nodes> -t <walltime> --mode script --attrs mcdram=cache:numa=quad ./myscript.sh

Here is an example script:

#!/bin/sh
echo "Starting Cobalt job script"
export n_nodes=$COBALT_JOBSIZE
export n_mpi_ranks_per_node=32
export n_mpi_ranks=$(($n_nodes * $n_mpi_ranks_per_node))
export n_openmp_threads_per_rank=4
export n_hyperthreads_per_core=2
export n_hyperthreads_skipped_between_ranks=4
# see aprun --help for more options
aprun -n $n_mpi_ranks -N $n_mpi_ranks_per_node \
  --env OMP_NUM_THREADS=$n_openmp_threads_per_rank -cc depth \
  -d $n_hyperthreads_skipped_between_ranks \
  -j $n_hyperthreads_per_core \
  --env FOO=$FOO --env BAR=$BAR : myprogram.exe myprogarg
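
The arithmetic in this script can be checked outside of Cobalt. The sketch below repeats the calculation with a made-up node count of 8, used only when COBALT_JOBSIZE is not set. The "cores used per node" figure is derived from the script's own values; whether it matches the core count of the actual nodes is an assumption to verify against the system documentation.

```shell
#!/bin/sh
# Dry run of the rank arithmetic from the job script, outside of Cobalt.
# COBALT_JOBSIZE is normally set by the scheduler; 8 is a made-up stand-in.
n_nodes=${COBALT_JOBSIZE:-8}
n_mpi_ranks_per_node=32
n_hyperthreads_per_core=2
n_hyperthreads_skipped_between_ranks=4

n_mpi_ranks=$((n_nodes * n_mpi_ranks_per_node))

# Ranks are spaced 4 hardware threads apart at 2 threads per core,
# so each rank spans 4 / 2 = 2 cores.
cores_per_rank=$((n_hyperthreads_skipped_between_ranks / n_hyperthreads_per_core))
cores_per_node=$((n_mpi_ranks_per_node * cores_per_rank))

echo "total MPI ranks:     $n_mpi_ranks"
echo "cores used per node: $cores_per_node"
```

With 8 nodes this gives 8 * 32 = 256 ranks in total, and 32 ranks * 2 cores = 64 cores used on each node.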

BG/Q Systems

Direct executable, no script

qsub -A <project> -n <nodes> -t <walltime> --mode <mode> --env "FOO=a:BAR=b" ./<exe>

<project> - Assigned project short name.

<nodes> - Number of nodes to use; will be rounded to fit in the nearest partition size.

<walltime> - Maximum wall clock time to allow your job to run, specified in minutes.

<mode> - Number of ranks per node. For example, c1 means 1 rank per node and c8 means 8 ranks per node. Valid values: c1, c2, c4, c8, c16, c32, c64.

<exe> - Binary executable.
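
To see how <nodes> and <mode> combine, the sketch below computes the total number of ranks for a given pair. The helper name bgq_total_ranks and the node count 512 are illustrative only, and the helper ignores the rounding of <nodes> to a partition size described above.

```shell
#!/bin/sh
# Sketch: derive the total rank count from a node count and a cN mode string.
bgq_total_ranks() {
    nodes="$1"
    mode="$2"
    # Accept only the mode values listed in this guide.
    case "$mode" in
        c1|c2|c4|c8|c16|c32|c64) ;;
        *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
    ranks_per_node=${mode#c}    # strip the leading "c"
    echo $((nodes * ranks_per_node))
}

# Example: 512 nodes (placeholder) with 16 ranks per node.
bgq_total_ranks 512 c16
```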

Submitting a script

In addition to allowing direct submission of an executable program, Cobalt on BG/Q can also handle scripts. Within your script, use the command runjob to start an execution.

qsub -A <project> -n <nodes> -t <walltime> --mode script --env "FOO=a:BAR=b" ./myscript.sh

Here is an example script:

#!/bin/sh
echo "Starting Cobalt job script"
# see runjob --help for more options
runjob --block $COBALT_PARTNAME --np $(($COBALT_JOBSIZE*16)) --ranks-per-node 16 \
  --verbose 2 --envs FOO=$FOO --envs BAR=$BAR : myprogram.exe myprogarg

Viz Cluster

You must write a script to pass to the qsub command, using the mpirun command in the script to invoke your executable. Here is an example submission command:

qsub -A <project> -n <nodes> -t <walltime> --mode script ./myscript.sh

Here is an example script:

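#!/bin/sh
# Count the nodes assigned to this job (one line per node in the node file)
NODES=$(wc -l < "$COBALT_NODEFILE")
# Launch 12 MPI processes on each node
PROCS=$((NODES * 12))
mpirun -f "$COBALT_NODEFILE" -n $PROCS /path/to/binary/myprogram.exe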
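
The node and process counting in this script can be tried outside of a job. The sketch below builds a fake node file with three invented host names, assuming the real file lists one node per line as the wc -l count implies, and repeats the calculation. In a real job, Cobalt writes the file and sets COBALT_NODEFILE for you.

```shell
#!/bin/sh
# Sketch: reproduce the node and process count outside of a job.
# The host names are invented stand-ins for what Cobalt would provide.
nodefile="${TMPDIR:-/tmp}/nodefile.example"
printf '%s\n' node01 node02 node03 > "$nodefile"

NODES=$(wc -l < "$nodefile")    # one line per node
PROCS=$((NODES * 12))           # 12 processes per node, as in the script
echo "nodes=$((NODES)) procs=$PROCS"
```

Three nodes at 12 processes each gives 36 processes in total.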

Step 7. Managing Your ALCF Account

Updating Your Contact Information

To communicate effectively with users and to abide by DOE regulations, we must have current contact information for each user. Please keep your personal information up to date so that we can reach you.

To update your information, please log in to the ALCF accounts page. You will need to use your Cryptocard token (physical or mobile) to log in to the website.

Step 8. ALCF Acknowledgement Policy

As an active Argonne Leadership Computing Facility (ALCF) user conducting research on an ALCF resource, you are asked to continue complying with the work acknowledgment policy that applies to your computing allocation award. For guidance on acknowledgments, please refer to the policy page on our website: http://www.alcf.anl.gov/user-guides/alcf-acknowledgment-policy.

Step 9. Getting Additional Support

Training Opportunities

ALCF conducts workshops and other events to train new users, help scale existing projects, and introduce new techniques to experienced users. Upcoming training events are listed on the ALCF website.

Contact Us

Email is the preferred method of requesting assistance. ALCF Service Desk staff are also available for phone support, primarily for troubleshooting account-related issues.

Email: support@alcf.anl.gov

Telephone: 630-252-3111 or 1-866-508-9181 (Toll free, US only)