FAQs for Queueing and Running on XC40

Where can I find the details of a job submission?

Details of the job submission are recorded in the <jobid>.cobaltlog file. This file contains the qsub command and the environment variables. Its location can be set with ‘qsub --debuglog <path>’; by default it is written to the same place as the .output and .error files.
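
As a small illustration of the default described above, the sketch below builds the expected log path from a job ID and the directory that holds the .output and .error files. The helper name cobaltlog_path and the job ID 123456 are hypothetical and not part of Cobalt. The sketch also assumes no --debuglog path was given at submission.

```shell
# Hypothetical helper (not part of Cobalt): print the expected location
# of the Cobalt log for a job, assuming the default location was used.
#   $1 = job ID
#   $2 = directory holding the .output/.error files (default: current dir)
cobaltlog_path() {
    jobid="$1"
    dir="${2:-.}"
    printf '%s/%s.cobaltlog\n' "$dir" "$jobid"
}

# Example with a made-up job ID:
#   cat "$(cobaltlog_path 123456)"
# To place the log elsewhere at submission time, use the option from the text:
#   qsub --debuglog <path> ...
```

If the log was redirected with --debuglog, look at the path given there instead.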

Why is my job stuck in "starting" state?

If you submit a job and qstat shows it in the "starting" state for 5 minutes or more, the most likely cause is that your memory/NUMA mode selection requires rebooting some or all of the nodes assigned to your job. The reboot takes about 15 minutes, during which your job remains in the "starting" state. When no reboot is required, the "starting" state lasts only a matter of seconds.
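
The timings above can be turned into a rough rule of thumb. The sketch below is only an illustration: the function name starting_state_hint is hypothetical, it takes the number of seconds the job has spent in "starting" as input (it does not query qstat itself), and the 20-minute upper bound is an assumed margin on top of the roughly 15-minute reboot, not a documented limit.

```shell
# Hypothetical helper: interpret how long a job has been in "starting".
#   $1 = elapsed time in the "starting" state, in seconds
starting_state_hint() {
    elapsed="$1"
    if [ "$elapsed" -lt 300 ]; then
        # Under 5 minutes: no conclusion yet.
        echo "too early to tell"
    elif [ "$elapsed" -le 1200 ]; then
        # 5 to 20 minutes (upper bound is an assumption): matches a reboot.
        echo "likely node reboot for memory/NUMA mode"
    else
        echo "longer than a typical reboot"
    fi
}

# Example: a job that has been "starting" for 8 minutes
#   starting_state_hint 480
```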