Using the Job Resource Manager on BG/Q: Commands, Options and Examples
This document provides examples of how to submit jobs on our BG/Q system. It also provides examples of commands for querying job status and partition availability. For an introduction to using the job resource manager and running jobs on BG/Q, see Running Jobs on the BG/Q System.
Submit a job request
Use qsub to submit a job. Scripts and interactive jobs are not supported at this time.
Run the compiled binary exe1 with 10 nodes for a maximum of 15 minutes:
qsub -n 10 -t 15 exe1
To submit jobs to a particular queue, use qsub -q <queue_name>.
To run the compiled binary exe1 with 10 nodes for a maximum of 30 minutes in the production queue:
qsub -q prod -n 10 -t 30 exe1
Charge a job to a project
Use qsub -A <project_name> to charge a job to a particular project. If you are a member of only one project, you do not need to specify a project name.
To run the compiled binary exe1 with 10 nodes for a maximum of 15 minutes and charge the job to MyProject:
qsub -n 10 -t 15 -A MyProject exe1
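The -q and -A options from the examples above can be given on the same command line. The following is a minimal sketch of that, not part of the resource manager: build_qsub is a hypothetical helper that only prints the qsub command it assembles, and prod, MyProject, and exe1 are the placeholder names used in this document.

```shell
# Hypothetical helper: assemble a qsub command line from the options described above.
# It only prints the command; paste the printed line at the prompt to submit the job.
build_qsub() {
    nodes="$1"        # value for -n (number of nodes)
    minutes="$2"      # value for -t (maximum run time in minutes)
    binary="$3"       # compiled binary to run
    queue="${4:-}"    # optional value for -q
    project="${5:-}"  # optional value for -A

    cmd="qsub"
    if [ -n "$queue" ]; then cmd="$cmd -q $queue"; fi
    if [ -n "$project" ]; then cmd="$cmd -A $project"; fi
    echo "$cmd -n $nodes -t $minutes $binary"
}

# Prints: qsub -q prod -A MyProject -n 10 -t 30 exe1
build_qsub 10 30 exe1 prod MyProject
```

Leaving out the last two arguments gives the basic form, qsub -n 10 -t 30 exe1.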
To see which projects you are a member of:
You can use the environment variable "COBALT_PROJ" to set your default project. qsub -A takes precedence over COBALT_PROJ.
On Vesta, if you are a member of one project (besides your pilot project), the non-pilot project will be your default project.
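As a sketch of the COBALT_PROJ default described above, the lines below set the variable for the current shell session and then submit without -A. MyProject is a placeholder project name, and the check for qsub is an added safeguard so the snippet does nothing harmful when run away from a login node.

```shell
# Make MyProject (a placeholder name) the default project for this shell session.
export COBALT_PROJ=MyProject

if command -v qsub >/dev/null 2>&1; then
    # No -A is given, so the job is charged to the project named in COBALT_PROJ.
    # Adding "-A <project_name>" here would take precedence over COBALT_PROJ.
    qsub -n 10 -t 15 exe1
else
    echo "qsub not found; run this on a BG/Q login node" >&2
fi
```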
Delete a job from the queue
To delete a job from the queue, use the qdel command.
Cancel job 34586:
qdel 34586
If the job fails to cancel (indicating that the resource manager is unable to kill the mpirun processes cleanly), you can try again with the force option:
qdel -f 34586
If you must forcibly delete a job, email Support at alcf.anl.gov with the job ID so that the necessary cleanup can be accomplished.
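The two-step cancellation above could be wrapped in a small shell function. This is only a sketch: cancel_job is a hypothetical name, and it assumes that qdel returns a non-zero exit status when a cancel fails, which this document does not state. Check that behavior on the system before relying on it.

```shell
# Hypothetical wrapper around the cancel-then-force sequence described above.
# ASSUMPTION: qdel exits with a non-zero status when it fails to cancel the job.
cancel_job() {
    job_id="$1"
    if qdel "$job_id"; then
        echo "job $job_id cancelled"
    elif qdel -f "$job_id"; then
        # A forced delete needs follow-up cleanup by Support.
        echo "job $job_id force-deleted; email Support with the job ID"
    else
        echo "could not delete job $job_id" >&2
        return 1
    fi
}
```

On a login node it would be called as cancel_job 34586.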
Query partition availability
To determine which partitions are currently available to the scheduler, use the partlist command. It lists each partition's name, queue, and state. For example:
% partlist
Name                    Queue                     State
MIR-00000-7BFF1-49152   prod-capability           blocked
MIR-04000-3BFF1-16384   prod-capability           idle
MIR-00000-33FF1-8192    prod-capability:backfill  blocked
MIR-04000-37FF1-8192    prod-capability:backfill  idle
MIR-44800-77FF1-4096    prod-short:backfill       blocked
MIR-04000-377F1-4096    prod-short:backfill       idle
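To pick out only the idle partitions from a listing like the one above, the output can be filtered with awk. The sketch below assumes the three whitespace-separated columns shown in the example (Name, Queue, State); idle_partitions is a hypothetical helper name, and the sample input is copied from the example rather than taken from a live system.

```shell
# Print the name and queue of every partition whose State column reads "idle".
# Assumes three whitespace-separated columns: Name, Queue, State.
idle_partitions() {
    awk '$3 == "idle" { print $1, $2 }'
}

# Sample input copied from the example listing; on the system, use:
#   partlist | idle_partitions
idle_partitions <<'EOF'
Name                    Queue                     State
MIR-00000-7BFF1-49152   prod-capability           blocked
MIR-04000-3BFF1-16384   prod-capability           idle
MIR-00000-33FF1-8192    prod-capability:backfill  blocked
MIR-04000-37FF1-8192    prod-capability:backfill  idle
MIR-44800-77FF1-4096    prod-short:backfill       blocked
MIR-04000-377F1-4096    prod-short:backfill       idle
EOF
```

With the sample input, this prints the three idle partitions (MIR-04000-3BFF1-16384, MIR-04000-37FF1-8192, and MIR-04000-377F1-4096), each followed by its queue.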