► USER TIP: For an introduction to using ALCF resources, view presentations on Getting Started on Theta and Cooley or Getting Started on Mira
Step 1. Request an ALCF Project
Step 2. Get an ALCF User Account
Step 3. Logging in to an ALCF Resource
Step 5. Data
Step 6. How to Run a Job
Step 7. Managing Your ALCF Account
Step 8. ALCF Acknowledgement Policy
Step 9. Getting Additional Support
You must get a project on the system you will be using before you can proceed. If you do not have a project, please visit www.alcf.anl.gov/user-guides/how-get-allocation to establish one.
To use the resources at ALCF, users need an account on our systems. The following steps describe how to request a user account at the ALCF.
- Please go to https://accounts.alcf.anl.gov/ and click on 'Request A New Account'.
- Input your email address, then click 'Continue'.
- You will receive an ANL Verification code from email@example.com
- Enter the verification code on the Request A New Account page
- Follow the steps outlined within the account request form, making sure ALL personal information is filled in thoroughly and accurately.
- All fields are mandatory unless marked optional
- When you get to the field that specifies 'Account Type', refer to this page to choose the appropriate type.
- After you have finished filling out the form, read through the Argonne National Lab Computer User Agreement, then check the box and click 'Create Account'
The login nodes, also known as the front-end nodes, are the nodes that users reach with the ssh command when connecting to the ALCF systems. These nodes allow for interactive activities such as editing files and compiling. The compute nodes are not directly accessible to the user, but are where users' code is executed when submitted with the qsub command. An important aspect of the Blue Gene system to note is that the hardware and operating systems of the login and compute nodes are different.
Activating Your Cryptocard
- After you have received your Cryptocard token and before you can use it, you must call the ALCF Service Desk (630-252-3111 or 866-508-9181) for us to verify your identity and activate the token. If you do not perform this step, you will not be able to log on to ALCF resources using your Cryptocard token.
Logging In Using Your Cryptocard
- To log in to an ALCF resource (for example, Vesta) from a Unix machine, use ssh: ssh vesta.alcf.anl.gov. If your username on your local machine is different from your username on ALCF resources, you will need to specify your ALCF account username in the ssh command. This can be done in one of two ways: by using the -l option or by prepending your username to the hostname.
- ssh -l <alcf_username> vesta.alcf.anl.gov
- ssh <alcf_username>@vesta.alcf.anl.gov
- When prompted for a password, the user must provide the PIN as well as the one-time password (Cryptocard password obtained by pressing the button on your token) to authenticate.
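To avoid typing the username option on every login, the two forms above can be captured in an SSH client configuration entry. The following is a minimal sketch, not an official ALCF recipe: the ALCF_USER and ALCF_SSH_CONFIG variables, the scratch file name, and the short alias "vesta" are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: generate an SSH client config entry for an ALCF login host.
# Assumptions (not from the guide): ALCF_USER, ALCF_SSH_CONFIG, the scratch
# file name, and the alias "vesta" are placeholders chosen for illustration.
alcf_user="${ALCF_USER:-your_alcf_username}"
config_file="${ALCF_SSH_CONFIG:-./alcf_ssh_config.example}"

# Write to a scratch file so an existing ~/.ssh/config is never overwritten.
cat > "$config_file" <<EOF
Host vesta
    HostName vesta.alcf.anl.gov
    User $alcf_user
EOF

echo "Wrote $config_file"
```

Once an entry like this is merged into ~/.ssh/config, "ssh vesta" should behave like "ssh <alcf_username>@vesta.alcf.anl.gov". The prompt for the PIN and one-time password is unchanged.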
If you have any problems logging in with the Cryptocard, please refer to the help section on troubleshooting your Cryptocard or call the ALCF Service Desk (630-252-3111 or 866-508-9181).
Setting Up Your Shell
Once you are logged in to the XC40, BG/Q, or viz cluster system, you will be at a UNIX shell prompt in your home ($HOME) directory. If you have never used UNIX before, you will need to learn some basic commands before you can submit jobs. A number of links to tutorial information may be found at the Unix Guru Universe Beginners' Pages. Your account will be set up with bash as the default shell, unless you requested a different shell in your account request.
We also provide tcsh and zsh. You can change your shell by logging in to your account web page. Scroll down to the "Unix Shell" section and choose your new shell.
Setting Up Your Software Environment
On the XC40, we use the modules environment management system. A default set of modules is set up for all users, which allows basic compilation of applications to run on the compute nodes. The basic commands to know are
- module list – display the modules you currently have loaded
- module swap – switch a currently-loaded module for an alternative
- module load – load a new module
- module unload – unload a loaded module
- module avail – list all available modules, covering different compilers, libraries, and tools
See the module man page and our web page on modules for more details.
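The commands listed above are typically used together: list what is loaded, then swap or load as needed. The sketch below illustrates that sequence under stated assumptions: OLD_MODULE and NEW_MODULE are hypothetical variables, and real module names should be taken from the output of "module avail" on the system in use. On a machine without the modules system, it only prints a notice.

```shell
#!/bin/sh
# Sketch: inspect the module environment, and optionally swap one module.
# OLD_MODULE and NEW_MODULE are hypothetical variables; set them to real
# names taken from the output of "module avail" on the system you use.
show_modules() {
    # "module" is usually a shell function, so test for it before calling.
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found (is this a system that provides modules?)"
        return 0
    fi
    module list                     # modules currently loaded
    if [ -n "$OLD_MODULE" ] && [ -n "$NEW_MODULE" ]; then
        module swap "$OLD_MODULE" "$NEW_MODULE"
        module list                 # confirm the swap took effect
    fi
}

show_modules
```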
On the BG/Q, a software environment management system called SoftEnv is used to control the system path and other environment variables required to run application software. The first time a user logs on to the BG/Q, a '.soft' configuration file is automatically created in the user's home directory. This file is set up with the default applications environment; typically it contains a single line with '@default'. Depending on the user's applications, further modifications to the .soft file may be necessary.
SoftEnv man pages are available in the default environment (use 'man softenv'). The command 'softenv' will list all available applications. Users wishing to gain a more complete understanding of how softenv works may read the complete softenv documentation ('man softenv-intro').
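As a rough illustration of the default configuration described above, the sketch below writes a one-line file containing '@default' and checks for that line. It uses a scratch file rather than the real $HOME/.soft; the file name, the SOFT_FILE variable, and the soft_has_default helper are hypothetical and not part of SoftEnv.

```shell
#!/bin/sh
# Sketch: create a minimal .soft-style file and check it for @default.
# Assumption: a scratch file stands in for $HOME/.soft so that the real
# configuration is left untouched. soft_has_default is a hypothetical helper.
soft_file="${SOFT_FILE:-./soft.example}"

# The default environment described above is a single '@default' line.
printf '%s\n' '@default' > "$soft_file"

# Succeed only if the file contains a line that is exactly '@default'.
soft_has_default() {
    grep -qx '@default' "$1"
}

if soft_has_default "$soft_file"; then
    echo "$soft_file includes the default environment"
else
    echo "$soft_file is missing @default" >&2
fi
```

Any additional entries should be taken from the list printed by the softenv command rather than guessed.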
As on BG/Q systems, we use SoftEnv on the viz cluster. Because the architectures are different (login nodes and compute nodes), yet they share the same home directories, we keep the software environment setups separate by using a separate ‘.soft.cooley’ configuration file. (Cooley is the name of our current viz cluster.)
sftp and scp
These standard utilities are available for local area transfers of small files.
Data Transfer Service
Globus: ALCF makes Globus endpoints available for data transfer to our systems. All ALCF endpoints have names beginning with “alcf#”. For more information about this service, see Using Globus.
For extensive information on running and queuing jobs, please visit our detailed web pages. Here's some basic information to get you started.
We use Cobalt, an Argonne-developed scheduler, for batch job submissions to run on compute nodes on all our systems. It is somewhat similar to PBS. The basic commands to know are
- qsub – submit a job
- qstat – view the status of queued and running jobs
- qdel – kill/cancel a job
Note that when you are submitting a script to Cobalt, the script file must be executable (chmod +x myscript.sh).
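A small pre-submission check can catch a missing execute bit before the job is queued. The following is a sketch only; ensure_executable is a hypothetical helper and not part of Cobalt, and the demonstration file name is made up.

```shell
#!/bin/sh
# Sketch: make sure a job script is executable before handing it to qsub.
# ensure_executable is a hypothetical helper, not part of Cobalt.
ensure_executable() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "error: $script does not exist" >&2
        return 1
    fi
    if [ ! -x "$script" ]; then
        chmod +x "$script"
        echo "added execute permission to $script"
    fi
    return 0
}

# Demonstration on a throwaway script with the execute bit removed.
printf '%s\n' '#!/bin/sh' 'echo "hello"' > ./myscript.example.sh
chmod -x ./myscript.example.sh
ensure_executable ./myscript.example.sh
```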
On the XC40, you must write a script to pass to the qsub command, using the aprun command in the script to invoke your executable. Here is an example submission command:
qsub -A <project> -n <nodes> -t <walltime> --mode script --attrs mcdram=cache:numa=quad ./myscript.sh
Here is an example script:
#!/bin/sh
echo "Starting Cobalt job script"
export n_nodes=$COBALT_JOBSIZE
export n_mpi_ranks_per_node=32
export n_mpi_ranks=$(($n_nodes * $n_mpi_ranks_per_node))
export n_openmp_threads_per_rank=4
export n_hyperthreads_per_core=2
export n_hyperthreads_skipped_between_ranks=4
# see aprun --help for more options
aprun -n $n_mpi_ranks -N $n_mpi_ranks_per_node \
  --env OMP_NUM_THREADS=$n_openmp_threads_per_rank -cc depth \
  -d $n_hyperthreads_skipped_between_ranks \
  -j $n_hyperthreads_per_core \
  --env FOO=$FOO --env BAR=$BAR : myprogram.exe myprogarg
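To make the arithmetic in the script above concrete, the sketch below works through the same variables for a two-node job. The node count is set by hand here (inside a real job it comes from $COBALT_JOBSIZE), and cores_per_node=64 is an assumption about the hardware that the guide does not state.

```shell
#!/bin/sh
# Sketch: work through the arithmetic of the job script for a 2-node job.
# n_nodes is set by hand; inside a real job it comes from $COBALT_JOBSIZE.
# cores_per_node=64 is an assumption about the hardware, used for display.
n_nodes=2
n_mpi_ranks_per_node=32
n_hyperthreads_per_core=2
n_hyperthreads_skipped_between_ranks=4
cores_per_node=64

n_mpi_ranks=$((n_nodes * n_mpi_ranks_per_node))
# Each rank reserves a block of hardware threads (the -d value).
hw_threads_per_node=$((n_mpi_ranks_per_node * n_hyperthreads_skipped_between_ranks))
# With -j hardware threads used per core, this many cores are occupied.
cores_used_per_node=$((hw_threads_per_node / n_hyperthreads_per_core))

echo "total MPI ranks:       $n_mpi_ranks"
echo "hardware threads/node: $hw_threads_per_node"
echo "cores used/node:       $cores_used_per_node of $cores_per_node"
```

With these values, the job runs 64 MPI ranks in total, and each node uses 128 hardware threads spread over 64 cores. The 4 OpenMP threads per rank fit exactly into the 4 hardware threads reserved for each rank.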
Direct executable, no script
qsub -A <project> -n <nodes> -t <walltime> --mode <mode> --env "FOO=a:BAR=b" ./<exe>
<project> - Assigned project short name
<nodes> - Number of nodes to use; will be rounded to fit in the nearest partition size.
<walltime> - Maximum wall clock time to allow your job to run, specified in minutes
<mode> - Number of ranks per node; for example, c1 is 1 rank per node and c8 is 8 ranks per node. Valid values: c1, c2, c4, c8, c16, c32, c64.
<exe> - binary executable.
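The relationship between <nodes>, <mode>, and the total number of ranks can be sketched as follows. The script only prints a qsub command and does not submit anything; the project name, node count, walltime, and executable name are placeholders, and ranks_per_node_for_mode is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: derive the total rank count from <nodes> and <mode>, then print
# (not run) the matching qsub command. All values below are placeholders.
# ranks_per_node_for_mode is a hypothetical helper, not part of Cobalt.
ranks_per_node_for_mode() {
    case "$1" in
        c1|c2|c4|c8|c16|c32|c64) echo "${1#c}" ;;   # strip the leading "c"
        *) echo "invalid mode: $1" >&2; return 1 ;;
    esac
}

project="MyProject"   # hypothetical project short name
nodes=512
walltime=30           # minutes
mode=c16

if rpn=$(ranks_per_node_for_mode "$mode"); then
    echo "total ranks: $((nodes * rpn))"
    echo "qsub -A $project -n $nodes -t $walltime --mode $mode ./myprogram.exe"
fi
```

For 512 nodes in c16 mode, this reports 8192 total ranks.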
Submitting a script
In addition to allowing direct submission of an executable program, Cobalt on BG/Q can also handle scripts. Within your script, use the command runjob to start an execution.
qsub -A <project> -n <nodes> -t <walltime> --mode script --env "FOO=a:BAR=b" ./myscript.sh
Here is an example script:
#!/bin/sh
echo "Starting Cobalt job script"
# see runjob --help for more options
runjob --block $COBALT_PARTNAME --np $(($COBALT_JOBSIZE*16)) --ranks-per-node 16 \
  --verbose 2 --envs FOO=$FOO --envs BAR=$BAR : myprogram.exe myprogarg
On the viz cluster, you must write a script to pass to the qsub command, using the mpirun command in the script to invoke your executable. Here is an example submission command:
qsub -A <project> -n <nodes> -t <walltime> --mode script ./myscript.sh
Here is an example script:
#!/bin/sh
NODES=`cat $COBALT_NODEFILE | wc -l`
PROCS=$((NODES * 12))
mpirun -f $COBALT_NODEFILE -n $PROCS /path/to/binary/myprogram.exe
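The node and process counts in the script above can be tried outside a job with a stand-in node file. In the sketch below, the file name and the two hostnames are made up for illustration; inside a real job, Cobalt sets COBALT_NODEFILE itself. The mpirun command is printed, not executed.

```shell
#!/bin/sh
# Sketch: reproduce the node and process counts of the job script outside
# of a job. Cobalt normally sets COBALT_NODEFILE; here a stand-in file with
# two made-up hostnames is created if the variable is not already set.
COBALT_NODEFILE="${COBALT_NODEFILE:-./nodefile.example}"
if [ ! -f "$COBALT_NODEFILE" ]; then
    printf '%s\n' node-a.example node-b.example > "$COBALT_NODEFILE"
fi

# One line per node, so the line count is the node count. The extra
# arithmetic expansion strips any padding that wc adds on some systems.
NODES=$(($(wc -l < "$COBALT_NODEFILE")))
PROCS=$((NODES * 12))   # 12 processes per node, as in the job script

echo "nodes: $NODES"
echo "mpirun -f $COBALT_NODEFILE -n $PROCS /path/to/binary/myprogram.exe"
```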
Updating Your Contact Information
To communicate effectively with users and to abide by DOE regulations, we must have current contact information for each user. Please keep your personal information up to date.
To update your information, please log in to the ALCF accounts page. If you forget your password, please call the ALCF Service Desk at 866-508-9181. Once your identity is verified, the password will be reset and provided to you.
If you are an active Argonne Leadership Computing Facility (ALCF) user conducting research on an ALCF resource, we request your continued cooperation and compliance with the work acknowledgment policy pertaining to your computing allocation award. For guidance on acknowledgments, please refer to the policy page on our website: http://www.alcf.anl.gov/user-guides/alcf-acknowledgment-policy.
ALCF conducts workshops and other events to train new users, help scale existing projects, and introduce new techniques to experienced users. See the ALCF website for upcoming training events.
ALCF Service Desk staff is available from 9:00 am through 5:00 pm (US Central Time), Monday through Friday for phone-based support. Email support is the preferred method of requesting assistance and covers standard hours (9:00 - 5:00) as well as emergency response for after-hours and weekend support.
Telephone: 630-252-3111 or 1-866-508-9181 (Toll free, US only)