The queuing system used at ALCF is Cobalt. Cobalt has two ways to queue a run: the basic method and the script method.
In the basic method, you give qsub the information needed for mpirun, and Cobalt performs the mpirun when the job starts. These are the most commonly used qsub options (for a complete list, run "man qsub").
-A Project       - project (-A YourProject)
-q queue         - queue (-q R.workshop)
-t time          - running time (-t 5 for 5 minutes, -t 01:10:20 for 1 hr 10 min 20 sec); includes partition boot, so allow at least 5 min
-n NN            - number of nodes (-n 64 for 64 nodes; each node runs 1 to 64 MPI tasks depending on how the mode flag is set)
--mode script/c1/c2/c4/c8/c16/c32/c64 - running mode (default c1); script selects script mode, otherwise cN causes the nodes to run N processes per node
--proccount NN   - number of MPI tasks (ranks) for the run (default is computed from -n and --mode)
-O Name          - name your job and stdout/stderr files (-O Job1)
-i file          - file name to be used for stdin
--env VAR1=1:VAR2=2:... - specify required environment variables
NOTE: Remember to give all options before the executable name.
qsub -A YourProject -q R.workshop -n 256 --mode c16 --proccount 1024 -t 30 \
  --env MYVAR=value1 -i inputdata -O Project1_out program.exe progarg1
Alternatively, Cobalt can run a job with a script. The syntax differs slightly from a PBS-style script; see Cobalt's documentation on script mode for details.
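As a sketch of script mode on Blue Gene/Q (the script name, executable, and arguments here are illustrative): script-mode jobs typically launch the application with runjob, and Cobalt exports the booted block's name in COBALT_PARTNAME.

```shell
#!/bin/bash
# myscript.sh - illustrative Cobalt script for Blue Gene/Q script mode.
# Cobalt sets COBALT_PARTNAME to the name of the block booted for this job.
# --np gives the total rank count; -p gives ranks per node (64 nodes x 16).
runjob --np 1024 -p 16 --block $COBALT_PARTNAME : ./program.exe progarg1
```

Such a script would be submitted with something like "qsub -A YourProject -t 30 -n 64 --mode script myscript.sh".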
The Blue Gene/Q platform provides users with the ability to allocate and boot blocks within a Cobalt resource allocation from their script. Unlike on the Blue Gene/P platform, booting resources on the Blue Gene/Q is separate from running jobs. For ensemble jobs, the --disable_preboot flag must be added to the qsub submission line.
Additionally, the get-bootable-blocks utility provides a list of available blocks. This command takes a parent block as an argument, and also accepts --size and --geometry flags as constraints on the blocks returned. Please visit Cobalt's project website for more information on ensemble jobs: http://trac.mcs.anl.gov/projects/cobalt/wiki/BGQUserComputeBlockControl.
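A minimal ensemble-job fragment might look like the following; it assumes a script-mode job submitted with --disable_preboot, and the block size and executable name are illustrative (see the Cobalt wiki page above for the authoritative workflow):

```shell
#!/bin/bash
# Pick the first bootable 512-node sub-block of the allocated parent block.
BLOCK=$(get-bootable-blocks --size 512 $COBALT_PARTNAME | head -n 1)
boot-block --block $BLOCK              # boot the sub-block
runjob --block $BLOCK : ./member.exe   # run one ensemble member on it
boot-block --free --block $BLOCK       # free the sub-block when done
```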
Queue Names and Scheduling Policy
Queue names and operations are described on the Job Scheduling Policy page.
You can find active project names that your account is associated with by running the command:
If an account is associated with more than one project, a job must be submitted by using a specific project name using -A, or by setting the environment variable COBALT_PROJ.
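For example (project name illustrative), either of the following selects the project explicitly:

```shell
# Name the project on the qsub command line...
qsub -A YourProject -q prod -n 512 -t 60 a.out

# ...or set it once for the shell session via the environment:
export COBALT_PROJ=YourProject
qsub -q prod -n 512 -t 60 a.out
```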
Job Submitted with the Wrong Arguments
If you submit a job with the wrong arguments, you can modify it without deleting and resubmitting it. Most settings can be changed using qalter.
Usage: qalter [-d] [-v] -A <project name> -t <time in minutes> -e <error file path> -o <output file path> --dependencies <jobid1>:<jobid2> -n <number of nodes> -h --proccount <processor count> -M <email address> --mode script/c1/c2/c4/c8/c16/c32/c64 <jobid1> <jobid2>
Note: To change the queue, use qmove.
Usage: qmove <queue name> <jobid> <jobid>
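For example, to extend the wall time of queued job 12345 to 60 minutes and then move it to another queue (the job ID and queue name are illustrative):

```shell
qalter -t 60 12345      # change the wall time to 60 minutes
qmove backfill 12345    # move job 12345 to the backfill queue
```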
Changing Executable after Job Submission
When a job is submitted via qsub, Cobalt records the path to the executable or script, but it does not make a copy. As a result, deleting or modifying the executable or script will affect any already-submitted jobs that use it. To avoid confusion, it is generally best to avoid making changes after job submission.
Holding and Releasing Jobs
To hold a job (prevent it from running), use qhold. This puts the job in the user_hold state.
To release a job in a user hold (user_hold) state, use qrls:
A job may also be put into a user hold immediately upon submission by passing qsub the -h flag:
qsub -n 512 -t 120 -A MyProject -h myExe
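For a job that is already queued, holding and releasing look like this (job ID illustrative):

```shell
qhold 12345   # job 12345 enters the user_hold state
qrls 12345    # job 12345 returns to the queued state
```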
For jobs in the dep_hold or dep_fail state, please see the section on job dependencies.
Jobs in the state admin_hold may only be released by a system administrator.
Jobs may temporarily enter the state maxrun_hold if the user has reached the limit of per-user running jobs in a particular queue. No action is required; as running jobs complete, jobs in the maxrun_hold state will be automatically changed back to queued and eligible to run.
To submit a job that waits until another job or jobs have completed, use the --dependencies argument to qsub. For example, to submit a job that depends on job 12345:
qsub -q prod -n 512 -t 10 -A yourproject --dependencies 12345 a.out
For multiple dependencies, list the job IDs separated by colons:
qsub -q prod -n 512 -t 10 -A yourproject --dependencies 12345:12346 a.out
Jobs submitted with dependencies will remain in the state dep_hold until all the dependencies are fulfilled, then will proceed to the state queued.
NOTE: In the event any of the dependencies do not complete successfully (nonzero exit status), the job will instead go into the state dep_fail. To manually release a job that is in either dep_hold or dep_fail:
qrls --dependencies <jobid>
or alternatively change the job's dependencies setting to "none":
qalter --dependencies none <jobid>
Customizing the Output of Qstat
Default fields displayed by the qstat command may be changed by setting the QSTAT_HEADER environment variable.
> qstat
JobID  User   WallTime  Nodes  State      Location
=======================================================
42342  user1  00:15:00  16     user hold  None
45273  user2  00:35:00  1024   queued     None
...
> export QSTAT_HEADER=JobId:JobName:User:WallTime:RunTime:Nodes:Mode:State:Queue
> qstat
JobId  JobName  User   WallTime  RunTime  Nodes  Mode  State      Queue
===================================================================================
42342  -        user1  00:15:00  N/A      16     smp   user hold  short
45273  -        user2  00:35:00  N/A      1024   smp   queued     medium
One may specify column headers via the --header flag to qstat.
Available field names can be seen by entering "qstat -fl <jobid>" for any current jobid.
Redirecting Standard Input
To redirect the standard input to a job, do not use the '<' redirection operator on the qsub command line. This simply redirects standard input to qsub, not the job itself. Instead, use the qsub option "-i".
# The wrong way
qsub -q queuename -t 10 -n 64 a.out < my_input_file.dat

# The right way
qsub -q queuename -t 10 -n 64 -i my_input_file.dat a.out
The sbank database is updated hourly, so transactions against your account can take up to an hour to appear.
Submitting into Backfill Partitions
Sometimes the scheduler drains nodes to clear room for a large job. During these times, although there may not be many jobs running, new jobs are not scheduled as expected.
At such times, backfill partitions may be available. For instance, suppose that 16 racks are being drained to allow a 16-rack job to run. Of the 16 racks, perhaps eight are empty and the other eight are running an eight-rack job that has two hours of wall time left. This creates an opportunity to run an eight-rack job of up to two hours in the backfill.
To discover available backfill, run the partlist command.
> partlist
Name                        Queue                                   State                            Backfill  Geometry
===========================================================================================================================
[...]
MIR-00000-7BFF1-49152       prod-capability:testing:backfill:R.pm   busy                             -         8x12x16x16x2
MIR-00000-77FF1-32768       prod-capability:testing:backfill:R.pm   blocked (MIR-00000-7BFF1-49152)  -         8x8x16x16x2
MIR-00000-7BFF1-0100-32768  prod-capability:testing:backfill:R.pm   blocked (MIR-00000-7BFF1-49152)  -         8x8x16x16x2
MIR-04000-7BFF1-32768       prod-capability:testing:backfill:R.pm   blocked (MIR-00000-7BFF1-49152)  -         8x8x16x16x2
MIR-00000-3BFF1-24576       prod-capability:testing:backfill:R.pm   blocked (MIR-00000-7BFF1-49152)  -         4x12x16x16x2
[...]
In this example, a 4K-, 8K-, or 16K-node job with a maximum wall time of 45 minutes can be run during this backfill. The backfill times will not always be identical and will depend on the mix of jobs on the partitions that are being drained.
Submitting to a Specific Partition
In rare cases, there may be a need to target a specific hardware partition. This may be accomplished using "--attrs location=".
qsub -t 10 -n 8192 --attrs location=MIR-00000-333F1-2048 myprogram.exe
This forces the job to run on that specific location. Should that location become unschedulable, for instance due to a failed node, the job will not be allowed to run anywhere else unless the location attribute is reset.
Running with a Group of Users
Sometimes it is useful to allow other users to run Cobalt commands on a given job such as qhold, qrls, or qdel. A list of users can be allowed to run commands on your job by submitting a list of users to qsub, cqsub, or qalter using the flag --run_users. Specified users need not be in the same project under which the job was submitted.
qsub -A FellowShipOTR -n 512 -t 1:00 --run_users frodo:sam:pippin ./council
As a convenience, all users belonging to the project under which a job was submitted can be added to a list of users that may control a job by using the --run_project flag.
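Continuing the example above, --run_project grants control to every member of the submitting project rather than to a named list:

```shell
qsub -A FellowShipOTR -n 512 -t 1:00 --run_project ./council
```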
Users who have been added to the list can run any command on the job that the job submitter could, including qhold, qrls, qalter, and qdel.
Group Running and File System Groups
While setting this list of users allows any of the listed users to run Cobalt commands on a job, it does not do anything about the permissions of any files involved with the job. Those must be handled by the user(s) setting appropriate permissions on their directories to allow users in their group to read and write files as appropriate. If your project needs a group on the file system to share files or a user needs to be added, email User Services (firstname.lastname@example.org).
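These are ordinary Unix group permissions; as a minimal sketch (the directory name is illustrative), a shared results directory can be opened to group members like so:

```shell
# Unix group permissions control access to job files; Cobalt's user list
# does not change them. The directory name here is illustrative.
mkdir -p "$HOME/shared_results"
chmod g+rx "$HOME/shared_results"    # group members may enter and list it
stat -c '%A' "$HOME/shared_results"  # show the resulting permission string
```

Group members also need execute permission on every directory along the path for this to be effective.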
For more information on Cobalt commands and their options, consult the man pages on the system. The same information may be found online in Cobalt's Command Reference.