Job Scheduling Policies on Cooley

There are six queues for general use on Cooley (default, nox11, pubnet, pubnet-nox11, debug, and pubnet-debug).

The default and pubnet queues are for production use.  The nox11 and pubnet-nox11 queues are analogous to the default and pubnet queues, except that the X server is disabled for runs in these queues in order to fully dedicate the GPUs to CUDA runs.  The following policy applies to all four of these queues:

  • Max. runtime: 12 hours
  • Max. job size: 110 nodes (the other sixteen nodes are dedicated to debugging)
  • Max. running jobs per user: 10
  • Max. node-hours (queued and running): 1320
  • Priority: FIFO -- (jobs are run in order, with small, short jobs run on any otherwise-free nodes)
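
As a rough illustration of how these limits combine, the Python sketch below checks a planned job against the production-queue policy. It assumes that node-hours are counted as node count times requested wall time, which this page does not state explicitly; the function names and the existing_node_hours parameter are hypothetical and not part of any ALCF tool. The per-user limit on running jobs is not modeled.

```python
# Limits copied from the production-queue policy list above.
MAX_RUNTIME_HOURS = 12
MAX_NODES = 110
MAX_NODE_HOURS = 1320  # summed over a user's queued and running jobs


def node_hours(nodes, hours):
    """Node-hours for one job (assumed: node count times requested wall time)."""
    return nodes * hours


def fits_production_policy(nodes, hours, existing_node_hours=0):
    """Return True if a new job stays within the production-queue limits.

    existing_node_hours is the total already requested by the user's other
    queued and running jobs (a hypothetical input for this sketch).
    """
    if nodes < 1 or hours <= 0:
        return False
    if nodes > MAX_NODES or hours > MAX_RUNTIME_HOURS:
        return False
    return existing_node_hours + node_hours(nodes, hours) <= MAX_NODE_HOURS
```

Under that assumption, a single job at the maximum size and runtime (110 nodes for 12 hours) accounts for the entire 1320 node-hour allowance, so no further jobs could be queued alongside it.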

In addition, there are sixteen nodes set aside for dedicated debugging queues, debug and pubnet-debug. These are intended for short debugging and interactive visualization runs only.  They have the following scheduling policy:

  • Max. runtime: 2 hours
  • Max. job size: 16 nodes
  • Max. running jobs per user: 1
  • Priority: FIFO -- (jobs are run in order, with small, short jobs run on any otherwise-free nodes)

For jobs that require public network connectivity, you may use the queues containing "pubnet" in the name.  If your jobs do not require direct network communication with resources outside of ALCF, please use the non-pubnet queues.

For jobs that use the GPUs directly for computation (e.g. CUDA) and don't require an X server, you may wish to use queues with "nox11" in the name, which will stop the X server that normally runs on the nodes in order to prevent any performance impact on your GPU jobs.
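
The queue guidance above can be summarized as a small decision helper. This is only a sketch in Python: pick_queue and its parameters are hypothetical names, and the queue names are the six listed at the top of this page. Because no debug queue with "nox11" in its name is listed, the sketch assumes that debugging jobs use debug or pubnet-debug whether or not they need the X server.

```python
def pick_queue(needs_public_network=False, needs_x_server=True, debugging=False):
    """Suggest a Cooley queue name from a job's requirements (illustrative only)."""
    if debugging:
        # Only two debug queues are listed; neither disables the X server.
        return "pubnet-debug" if needs_public_network else "debug"
    if needs_x_server:
        return "pubnet" if needs_public_network else "default"
    # GPU compute jobs that do not need X can use the nox11 variants.
    return "pubnet-nox11" if needs_public_network else "nox11"
```

For example, a production CUDA job with no X server or outside network requirement maps to nox11, while a short interactive visualization test that must reach an external resource maps to pubnet-debug.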

If you have needs not addressed by the standard queues, please send mail to support@alcf.anl.gov requesting a reservation.

We will monitor Cooley's queues and evaluate the above policies as needed. Your feedback (send e-mail to support@alcf.anl.gov) is appreciated.