ATP and STAT are tools for debugging abnormal program terminations such as segfaults. ATP (Abnormal Termination Processing) monitors a program while it runs. If the program crashes, ATP invokes STAT (the Stack Trace Analysis Tool) to merge the stack backtraces of the application processes into an output file, atpMergedBT.dot. This merged backtrace file can then be visualized with STAT's visualization tool, stat-view.
Using ATP with stat-view
Suppose your application segfaults when you run it. After the job finishes, its stderr file (which defaults to $COBALT_JOBID.error) contains:
user@thetalogin6:~> cat $COBALT_JOBID.error
_pmiu_daemon(SIGCHLD): [NID 03834] [c7-1c2s14n2] [Sat Aug 18 03:21:19 2018] \
PE RANK 30 exit signal Segmentation fault
[NID 03834] 2018-08-18 03:21:19 Apid 4938801: initiated application termination
ATP and stat-view can be used to look into the segfault.
To use ATP, the atp module must be loaded before you link your application. It is loaded by default on Theta; to verify this, run module list and check that the atp module appears in the output.
user@thetalogin6:~> module list
Currently Loaded Modulefiles:
  1) modules/22.214.171.124
  2) intel/126.96.36.199
  3) craype-network-aries
  ...
 16) atp/2.1.2
 17) perftools-base/7.0.2
  ...
Running the code
Next, the environment variable ATP_ENABLED must be set in the job script to enable ATP.
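As a minimal sketch of what this might look like, the job script below exports ATP_ENABLED=1 before launching the application. The node count, walltime, project name (MyProject), rank counts, and binary name (./a.out) are placeholders chosen for illustration and are not taken from this page. The aprun call is guarded so that the script can also be tried on a machine without the Cray launcher.

```shell
#!/bin/bash
# Hypothetical Cobalt job script: the options and names below are placeholders.
#COBALT -n 1 -t 10 -A MyProject

# Enable ATP for this job. This must be set before the application is launched.
export ATP_ENABLED=1

# Launch the application only where aprun is available.
if command -v aprun >/dev/null 2>&1; then
    aprun -n 64 -N 64 ./a.out
else
    echo "aprun not found; ATP_ENABLED=${ATP_ENABLED}"
fi
```

Because the variable is exported in the script itself, it is set in the environment from which aprun starts the application, rather than depending on the login shell's environment.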