Darshan is a lightweight I/O instrumentation library that can be used to investigate the I/O behavior of production applications. It records statistics, such as the number of files opened, time spent performing I/O, and the amount of data accessed by an application.
Mira, Cetus, and Vesta
When a Darshan-enabled job completes, it will generate a single output file containing I/O characterization results. Each output file is placed in the following directory:
- Mira or Cetus: /gpfs/mira-fs0/logs/darshan/mira/<YEAR>/<MONTH>/<DAY>
- Vesta: /gpfs/vesta-fs0/logs/darshan/vesta/<YEAR>/<MONTH>/<DAY>
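The directory layout above can be sketched as a small shell helper. This is only an illustration: the function name darshan_log_dir and the date used in the example are hypothetical, and the year, month, and day are passed through exactly as given, so check an existing directory to see whether your system zero-pads them.

```shell
# Hypothetical helper: print the Darshan log directory for a system and date.
# The year/month/day values are used as given (no padding is added or removed).
darshan_log_dir() {
    system="$1"; year="$2"; month="$3"; day="$4"
    case "$system" in
        mira|cetus) echo "/gpfs/mira-fs0/logs/darshan/mira/${year}/${month}/${day}" ;;
        vesta)      echo "/gpfs/vesta-fs0/logs/darshan/vesta/${year}/${month}/${day}" ;;
        *)          echo "unknown system: ${system}" >&2; return 1 ;;
    esac
}

# Example with a made-up date: list the directory only if it exists here.
dir=$(darshan_log_dir mira 2015 7 27)
if [ -d "$dir" ]; then
    ls "$dir"
fi
```

Cetus jobs resolve to the Mira path because both systems share one log directory.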
The name of the output file will be in the format:
A graphical summary of I/O behavior can be generated using the darshan-job-summary.pl utility, which is installed on Mira, Vesta, and the Cooley analytics cluster. To use the utility, first add the SoftEnv key +darshan to your ~/.soft file (on Mira or Vesta) or your ~/.soft.cooley file (on Cooley) if it is not already there, then run the "resoft" command. The following example shows how to execute the utility.
# on Mira and Cooley login node:
darshan-job-summary.pl /gpfs/mira-fs0/logs/darshan/mira/carns_my-app_id114525_7-27-58921_19.darshan.gz --output ~/job-summary.pdf
# on Vesta login node:
darshan-job-summary.pl /gpfs/vesta-fs0/logs/darshan/vesta/carns_my-app_id114525_7-27-58921_19.darshan.gz --output ~/job-summary.pdf
The entire contents of the output file can be translated into text format for more detailed analysis using the following command, which is available on Mira, Vesta, and Cooley:
# on Mira or Cooley:
darshan-parser /gpfs/mira-fs0/logs/darshan/mira/carns_my-app_id114525_7-27-58921_19.darshan.gz > ~/job-characterization.txt
# on Vesta:
darshan-parser /gpfs/vesta-fs0/logs/darshan/vesta/carns_my-app_id114525_7-27-58921_19.darshan.gz > ~/job-characterization.txt
Note: The resulting text file will be verbose. To interpret its contents, use the guidelines in the Guide to Darshan-parser Output.
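Because the text file is verbose, it can help to filter it before reading. The sketch below assumes that header lines in the darshan-parser output start with "#" and that each record line contains the counter name; the function name darshan_grep_counter is hypothetical, and the counter substring BYTES_WRITTEN is only an example, since exact counter names depend on the installed Darshan version.

```shell
# Hypothetical helper: print record lines that mention a given counter.
# Lines beginning with '#' are treated as header text and skipped.
darshan_grep_counter() {
    counter="$1"; file="$2"
    grep -v '^#' "$file" | grep -- "$counter"
}

# Example: only runs if the parser output from the command above exists.
if [ -f "$HOME/job-characterization.txt" ]; then
    darshan_grep_counter BYTES_WRITTEN "$HOME/job-characterization.txt" || true
fi
```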
Disabling Darshan on Mira or Vesta
Disabling Darshan is discouraged unless you have a specific problem or have been instructed to do so by the ALCF support team. Disabling Darshan limits the ALCF’s ability to assist in supporting your application.
Darshan can be disabled by setting the DARSHAN_DISABLE=1 environment variable. If this variable is set at compile time, then Darshan instrumentation will not be included in your executable at all. It can also be used at run time (in your job submission) to deactivate Darshan on a case-by-case basis for existing executables.
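As a minimal sketch of both cases, the variable can be set for one build command or exported for the rest of a script. The build command (make) is a placeholder, and the Cobalt option mentioned in the comments is an assumption to confirm against the qsub documentation on your system.

```shell
# Compile time: set the variable for a single build command so that the
# resulting executable contains no Darshan instrumentation, for example:
#   DARSHAN_DISABLE=1 make        # "make" is a placeholder build command

# Run time: export the variable before launching an existing executable.
# Assumption: on Mira or Vesta the variable must also reach the compute
# nodes, e.g. via an option such as "qsub --env DARSHAN_DISABLE=1".
export DARSHAN_DISABLE=1
echo "DARSHAN_DISABLE=${DARSHAN_DISABLE}"
```

Setting the variable on a single command rather than exporting it in a login profile keeps the change limited to the builds or jobs that need it.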
Possible Problems on Mira or Vesta
Darshan will not produce output files in the following scenarios:
- Use of any language besides C, C++, or FORTRAN
- Use of non-standard MPI libraries or linkers
- Use of MPI profilers
  - Darshan defers to any other tool that uses the PMPI profiling interface
- Use of dynamic linking
- Job did not call MPI_Finalize(). Reasons may include:
  - Job hit wall time limit
  - Abnormal termination
  - The executable is not an MPI program
In such cases, contact ALCF Support for help. Depending on your situation, it may still be possible to use Darshan.
Darshan is not automatically enabled for all jobs on Cooley. Unlike Mira and Vesta, all applications on Cooley are dynamically linked by default, which means that Darshan must be loaded at runtime using the LD_PRELOAD environment variable. In order to instrument a job on Cooley, you must first add the SoftEnv key +darshan to your ~/.soft.cooley file and run the “resoft” command. Then add the following to the mpirun command line in your job script:
# within Cooley job script
mpirun --env LD_PRELOAD=$DARSHAN_PRELOAD -np <number of processes> -f $COBALT_NODEFILE ./app.exe <arguments>
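If DARSHAN_PRELOAD is empty (for example, because "resoft" was not run), the mpirun line above would start the job without instrumentation. A job script can guard against that. In the sketch below the function name cooley_darshan_cmd is hypothetical, and it only prints the command line rather than running it.

```shell
# Hypothetical helper: build the instrumented mpirun command line, or fail
# with a message if DARSHAN_PRELOAD has not been set by the +darshan key.
cooley_darshan_cmd() {
    nprocs="$1"; shift
    if [ -z "${DARSHAN_PRELOAD:-}" ]; then
        echo "DARSHAN_PRELOAD is not set; add +darshan to ~/.soft.cooley and run resoft" >&2
        return 1
    fi
    echo "mpirun --env LD_PRELOAD=${DARSHAN_PRELOAD} -np ${nprocs} -f ${COBALT_NODEFILE:-} $*"
}
```

Inside a job script, the printed line can be inspected first and then executed once it looks correct, for example with: cmd=$(cooley_darshan_cmd 4 ./app.exe) && eval "$cmd"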
After your job completes, you can find the Darshan output file in the following directory:
Note the path component tukey for data generated on Cooley.
The same tools described in the Mira and Vesta documentation can be used to interpret Darshan output files generated on Cooley.