Onboarding Guide

USER TIP: View "Getting Started," a video introduction to ALCF services and resources with useful tips to boost your job throughput.

Step 1. Request an ALCF Project

Step 2. Get an ALCF User Account

Step 3. Logging in to an ALCF Resource

Step 4. Setting Up Your Computing Environment

Step 5. Data

Step 6. How to Run a Job

Step 7. Managing Your ALCF Account

Step 8. ALCF Acknowledgement Policy

Step 9. Getting Additional Support

 

Step 1. Request an ALCF Project

You must get a project on the system you will be using before you can proceed. If you do not have a project, please visit www.alcf.anl.gov/user-guides/how-get-allocation to establish one.

Step 2. Get an ALCF User Account

To use ALCF resources, you need an account on our systems. The following steps describe how to request a user account at the ALCF.

  • Please go to https://accounts.alcf.anl.gov/ and click on 'Request an account'.
  • Click 'Proceed with account request'.
  • Select the appropriate radio button, input your email address, then click 'Proceed to next step'.
  • When prompted to search for an existing account, please enter the relevant information, then hit 'search' to look for an existing user account.
  • If no existing user account is found, hit 'Continue with Account Request'.
  • If you find an existing user account that needs to be reactivated, please fill out the form on the page, then click 'Submit Reactivation Request'.
  • From the 'Project Information' page select the project that you would like to be a member of, then scroll to the bottom of the page, and click 'Proceed to next step'.
  • Follow the steps outlined within the account request form, making sure ALL personal information is filled in thoroughly and accurately.
    • Please include any alternate addresses, phone numbers and email addresses (home etc.)
    • Proceed to the next page by choosing the appropriate button ('I am a US citizen', 'I am NOT a US citizen')
  • When you get to the page that specifies 'Type of Account':
    • Be sure to select 'Annual'
    • Your project PI's name should be in the 'Sponsor Full Name' field. If you are the PI, enter the name of the ALCF staff member that you are in contact with.
    • Your project PI's email address should be in the 'Sponsor Email Address' field. If you are the PI, enter the email address of the ALCF staff member you named above.
    • Click 'Proceed to next step'
  • Select the appropriate resource based on the project you selected earlier.
  • Read through the Argonne National Lab Computer User Agreement, then click 'Submit'

Step 3. Logging in to an ALCF Resource

The login nodes, also known as front-end nodes, are the nodes you reach when you ssh to an ALCF system. These nodes allow for interactive activities such as editing files and compiling. The compute nodes are not directly accessible to users; they are where your code runs when you submit it with the qsub command. An important aspect of the Blue Gene system is that the hardware and operating systems of the login and compute nodes are different.

Activating Your Cryptocard

  • After you have received your Cryptocard token and before you can use it, you must call the ALCF Service Desk (630-252-3111 or 866-508-9181) for us to verify your identity and activate the token. If you do not perform this step, you will not be able to log on to ALCF resources using your Cryptocard token.

Logging In Using Your Cryptocard

  • To log in to an ALCF resource (for example, Vesta) from a Unix machine, ssh to its hostname: ssh vesta.alcf.anl.gov. If your username on your local machine is different from your username on ALCF resources, you will need to specify your ALCF account username in the ssh command. This can be done in one of two ways: by using the -l option or by prepending your username to the hostname.
    • ssh -l <alcf_username> vesta.alcf.anl.gov
    • ssh <alcf_username>@vesta.alcf.anl.gov
       
  • When prompted for a password, the user must provide the PIN as well as the one-time password (Cryptocard password obtained by pressing the button on your token) to authenticate.
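
To avoid typing your ALCF username on every login, you can store it in an OpenSSH client configuration entry. The sketch below is one way to generate such an entry. The helper name alcf_ssh_host_entry and the username jdoe are hypothetical, and the entry assumes the <system>.alcf.anl.gov hostname pattern used in the ssh commands above.

```shell
# Print an OpenSSH client config entry for an ALCF login host.
# $1: system name (for example, vesta); $2: your ALCF username.
# The function name is illustrative, not an ALCF-provided tool.
alcf_ssh_host_entry() {
    system=$1
    username=$2
    printf 'Host %s\n' "$system"
    printf '    HostName %s.alcf.anl.gov\n' "$system"
    printf '    User %s\n' "$username"
}
```

For example, running "alcf_ssh_host_entry vesta jdoe >> ~/.ssh/config" appends the entry, after which "ssh vesta" should connect with the stored username. You will still be prompted for your PIN and one-time password.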

If you have any problems logging in with the Cryptocard, please refer to the "Troubleshooting My Cryptocard" help section or call the ALCF Service Desk (630-252-3111 or 866-508-9181).

Step 4. Setting Up Your Computing Environment

Setting Up Your Shell

Once you are logged in to the XC40, BG/Q, or viz cluster system, you will be at a UNIX shell prompt in your home ($HOME) directory. If you have never used UNIX before, you will need to learn some basic commands before you can submit jobs. Links to tutorial information can be found at the Unix Guru Universe Beginners' Pages. Your account will be set up with bash as the default shell unless you requested a different shell in your account request.

We also provide tcsh and zsh. You can change your shell by logging in to your account web page. Scroll down to the "Unix Shell" section and choose your new shell.

Setting Up Your Software Environment

XC40 Systems

We use the modules environment management system. A default set of modules is set up for all users, which allows basic compilation of applications to run on the compute nodes. The basic commands to know are:

  • module list – display the modules you currently have loaded
  • module swap – switch a currently-loaded module for an alternative
  • module load – load a new module
  • module unload – unload a loaded module
  • module avail – list all available modules, covering different compilers, libraries, and tools

See the module man page and our web page on modules for more details.
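
As a sketch of how these commands combine, the function below swaps one loaded module for another and then lists what is loaded. The function name swap_and_show is hypothetical, and the module names in the usage note (PrgEnv-intel, PrgEnv-gnu) are assumptions based on common Cray XC conventions; run module avail to see what is actually installed.

```shell
# Swap one loaded module for another, then show the resulting module list.
# $1: currently loaded module; $2: replacement module.
swap_and_show() {
    current=$1
    replacement=$2
    # Stop here if the swap fails (for example, if the module name is wrong)
    module swap "$current" "$replacement" || return 1
    module list
}
```

For example, "swap_and_show PrgEnv-intel PrgEnv-gnu" would switch compiler environments, assuming both modules exist on the system.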

BG/Q Systems

A software environment management system called SoftEnv is used to control the system path and other environment variables required to run application software on the BG/Q. The first time a user logs on to BG/Q, a '.soft' configuration file is automatically created in the user's home directory. This file is set up with the default applications environment; typically it contains a single line with '@default'. Depending on the user's applications, further modifications to the .soft file may be necessary.

SoftEnv man pages are available in the default environment (use 'man softenv'). The command 'softenv' will list all available applications. Users wishing to gain a more complete understanding of how softenv works may read the complete softenv documentation ('man softenv-intro').
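
Modifying the .soft file usually means adding a key line. The sketch below adds a key to the top of a .soft file if it is not already present. The function name soft_add_key and the key +example-key are hypothetical (use softenv to find real key names), and placing new keys ahead of @default is an assumption about the ordering you want. After editing, the SoftEnv resoft command is normally used to apply the change to the current shell.

```shell
# Add a SoftEnv key to the top of a .soft file unless it is already there.
# $1: path to the .soft file; $2: key to add (hypothetical example: +example-key).
soft_add_key() {
    soft_file=$1
    key=$2
    # -x matches the whole line, -F treats the key as a fixed string
    if grep -qxF -e "$key" "$soft_file" 2>/dev/null; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # New key first, then the existing contents (such as @default)
    { printf '%s\n' "$key"; cat "$soft_file" 2>/dev/null; } > "$tmp"
    # Write back in place so the file keeps its original permissions
    cat "$tmp" > "$soft_file" && rm -f "$tmp"
}
```

For example, "soft_add_key ~/.soft +example-key" leaves a file whose first line is +example-key and whose remaining lines are unchanged.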

Viz Cluster

As on BG/Q systems, we use SoftEnv on the viz cluster. Because the architectures are different (login nodes and compute nodes) but share the same home directories, we keep the software environment setups separate by using a separate '.soft.cooley' configuration file. (Cooley is the name of our current viz cluster.)

Step 5. Data

Transfer Utilities

sftp and scp

These standard utilities are available for local area transfers of small files.
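
For example, a small file can be copied to a login node with scp. This is a sketch: the function name alcf_upload, the username jdoe, and the file name in the usage note are hypothetical, and the host follows the <system>.alcf.anl.gov pattern used for ssh logins.

```shell
# Copy a local file to a path on an ALCF login host with scp.
# $1: local file; $2: ALCF username; $3: system name; $4: remote path.
alcf_upload() {
    local_file=$1
    username=$2
    system=$3
    remote_path=$4
    scp "$local_file" "${username}@${system}.alcf.anl.gov:${remote_path}"
}
```

For example, "alcf_upload input.dat jdoe vesta data/" would copy input.dat into the data directory under the home directory of jdoe on Vesta.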

Data Transfer Service

Globus: ALCF makes Globus endpoints available for data transfer to our systems. All ALCF endpoints have names beginning with “alcf#”. For more information about this service, see Using Globus.

Step 6. How to Run a Job

For extensive information on running and queuing jobs, please visit our detailed web pages. Here's some basic information to get you started.

Cobalt

We use Cobalt, an Argonne-developed scheduler, for batch job submissions to run on compute nodes on all our systems. It is somewhat similar to PBS. The basic commands to know are:

  • qsub – submit a job
  • qstat – view the status of queued and running jobs
  • qdel – kill/cancel a job

Note that when you are submitting a script to Cobalt, the script file must be executable (chmod +x myscript.sh).
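
The sketch below strings these steps together: it makes a script executable, submits it, and checks its status. The function name submit_and_check is hypothetical. It assumes that qsub prints the new job ID on standard output and that qstat accepts a job ID as an argument; check the man pages on your system before relying on either.

```shell
# Make a job script executable, submit it, and show its queue status.
# $1: project; $2: node count; $3: walltime; $4: path to the job script.
# Assumes qsub prints the new job ID on standard output.
submit_and_check() {
    project=$1
    nodes=$2
    walltime=$3
    script=$4
    chmod +x "$script" || return 1
    jobid=$(qsub -A "$project" -n "$nodes" -t "$walltime" --mode script "$script") || return 1
    echo "Submitted job $jobid"
    qstat "$jobid"
}
```

For example, "submit_and_check MyProject 2 30 ./myscript.sh" submits a two-node job under the placeholder project name MyProject. To cancel the job later, pass the reported job ID to qdel.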

XC40 Systems

You must write a script to pass to the qsub command, using the aprun command in the script to invoke your executable. Here is an example submission command:

qsub -A <project> -n <nodes> -t <walltime> --mode script --attrs mcdram=cache:numa=quad ./myscript.sh

Here is an example script:

#!/bin/sh
echo "Starting Cobalt job script"
export n_nodes=$COBALT_JOBSIZE
export n_mpi_ranks_per_node=32
export n_mpi_ranks=$(($n_nodes * $n_mpi_ranks_per_node))
export n_openmp_threads_per_rank=4
export n_hyperthreads_per_core=2
export n_hyperthreads_skipped_between_ranks=4
# see aprun --help for more options
# -n: total MPI ranks, -N: ranks per node, -d: hardware threads reserved per rank,
# -j: hyperthreads used per core
aprun -n $n_mpi_ranks -N $n_mpi_ranks_per_node \
  --env OMP_NUM_THREADS=$n_openmp_threads_per_rank -cc depth \
  -d $n_hyperthreads_skipped_between_ranks \
  -j $n_hyperthreads_per_core \
  --env FOO=$FOO --env BAR=$BAR myprogram.exe myprogarg

BG/Q Systems

Direct executable, no script

qsub -A <project> -n <nodes> -t <walltime> --mode <mode> --env "FOO=a:BAR=b" ./<exe>

<project> - Assigned project short name

<nodes> - Number of nodes to use; will be rounded to fit in the nearest partition size.

<walltime> - Maximum wall clock time to allow your job to run, specified in minutes

<mode> - Number of ranks per node, c1 - 1 rank per node, c8 - 8 ranks per node. Valid values: c1, c2, c4, c8, c16, c32, c64.

<exe> - binary executable.

Submitting a script

In addition to allowing direct submission of an executable program, Cobalt on BG/Q can also handle scripts. Within your script, use the command runjob to start an execution.

qsub -A <project> -n <nodes> -t <walltime> --mode script --env "FOO=a:BAR=b" ./myscript.sh

Here is an example script:

#!/bin/sh
echo "Starting Cobalt job script"
# see runjob --help for more options
runjob --block $COBALT_PARTNAME --np $(($COBALT_JOBSIZE*16)) --ranks-per-node 16 \
  --verbose 2 --envs FOO=$FOO --envs BAR=$BAR : myprogram.exe myprogarg
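
In the script above, the value passed to --np is the number of nodes multiplied by the ranks per node. As a sketch of the same arithmetic for the c1 through c64 modes described earlier, the hypothetical helper below computes the total rank count from a requested node count and a mode. It uses the node count you request, not the partition size your request may be rounded to.

```shell
# Compute the total number of MPI ranks from a node count and a mode (c1..c64).
# $1: number of nodes; $2: mode such as c16.
total_ranks() {
    nodes=$1
    mode=$2
    case $mode in
        c1|c2|c4|c8|c16|c32|c64) ;;
        *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
    ranks_per_node=${mode#c}   # strip the leading "c"
    echo $((nodes * ranks_per_node))
}
```

For example, "total_ranks 512 c16" prints 8192.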

Viz Cluster

You must write a script to pass to the qsub command, using the mpirun command in the script to invoke your executable. Here is an example submission command:

qsub -A <project> -n <nodes> -t <walltime> --mode script ./myscript.sh

Here is an example script:

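#!/bin/sh
# Count the nodes assigned to this job from the Cobalt node file
NODES=$(wc -l < "$COBALT_NODEFILE")
# Launch 12 MPI processes per node
PROCS=$((NODES * 12))
mpirun -f "$COBALT_NODEFILE" -n $PROCS /path/to/binary/myprogram.exe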

Step 7. Managing Your ALCF Account

Updating Your Contact Information

To communicate effectively with users and to abide by DOE regulations, we must have current information for each user. Please keep your personal information up to date so that we can contact you.

To update your information, please log in to the ALCF accounts page. If you forget your password, please call the ALCF Service Desk at 866-508-9181. Once your identity is verified, your password will be reset and provided to you.

Step 8. ALCF Acknowledgement Policy

As an active Argonne Leadership Computing Facility (ALCF) user conducting research on an ALCF resource, you are asked to comply with the work acknowledgment policy pertaining to your computing allocation award. For guidance on acknowledgments, please refer to the policy page on our website: http://www.alcf.anl.gov/user-guides/alcf-acknowledgment-policy.

Step 9. Getting Additional Support

Training Opportunities

ALCF conducts workshops and other events to train new users, help scale existing projects, and introduce new techniques to experienced users. Upcoming training events are listed on the ALCF website.

Contacting Us

ALCF Service Desk staff are available for phone-based support from 9:00 am through 5:00 pm (US Central Time), Monday through Friday. Email is the preferred method of requesting assistance; it covers standard hours (9:00 am through 5:00 pm) as well as emergency response after hours and on weekends.

Email: support@alcf.anl.gov

Telephone: 630-252-3111 or 1-866-508-9181 (Toll free, US only)