Using HPSS with Theta
HPSS is a data archive and retrieval system that manages large amounts of data on disk and robotic tape libraries. It provides hierarchical storage management services that allow it to migrate data between those storage platforms.
HPSS is currently configured with a disk tier and a tape tier. The disk tier has a capacity of 1.2PB on a DataDirect Networks SFA12K-40 storage array. By default, all archived data is initially written to the disk tier. The tape tier consists of 3 SpectraLogic T950 robotic tape libraries containing a total of 72 LTO6 tape drives, with a total uncompressed capacity of 64 PB. Archived data is migrated to the tape tier at regular intervals, then deleted from the disk tier to create space for future archives.
Access to HPSS is provided by various client components. Currently, ALCF supports access through two command-line clients, HSI and HTAR. These are installed on the login nodes of Theta and Cooley. For the client to authenticate with HPSS, the user must have a keytab file located in the .hpss subdirectory of their home directory. The file name will be in the format .ktb_
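Before starting a client, it can be useful to confirm that a keytab file is in place. The following is a minimal sketch, not an official ALCF tool: the function name has_hpss_keytab is hypothetical, and it only checks for a readable file whose name begins with .ktb_ without assuming anything about the rest of the name.

```shell
#!/bin/sh
# Return success if DIR (default: $HOME/.hpss) contains at least one
# readable regular file whose name starts with ".ktb_".
# has_hpss_keytab is a hypothetical helper, not part of HSI or HTAR.
has_hpss_keytab() {
    dir=${1:-"$HOME/.hpss"}
    for f in "$dir"/.ktb_*; do
        # If the pattern matched nothing, $f is the literal pattern
        # and the -f test below fails.
        if [ -f "$f" ] && [ -r "$f" ]; then
            return 0
        fi
    done
    return 1
}

if has_hpss_keytab; then
    echo "keytab found in $HOME/.hpss"
else
    echo "no readable keytab in $HOME/.hpss; contact support"
fi
```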
HSI General Usage
Before you can use HSI on XC40 systems such as Theta, you must load a module:
module load hsi
HSI can be invoked by simply entering hsi at your normal shell prompt. Once authenticated, you will enter the hsi command shell environment:
> hsi
[HSI]/home/username->
You may enter "help" to display a brief description of available commands.
If you are archiving from or retrieving to home, grand, or eagle, you must disable the Transfer Agent by adding the -T off option to the put or get command.
[HSI]/home/username-> put mydatafile                # same name on HPSS
[HSI]/home/username-> put local.file : hpss.file    # different name on HPSS
[HSI]/home/username-> put -T off mydatafile
[HSI]/home/username-> get mydatafile
[HSI]/home/username-> get local.file : hpss.file
[HSI]/home/username-> get -T off mydatafile
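The choice between the plain and the -T off forms can be made by a small wrapper. This is only a sketch under stated assumptions: the helper name hsi_transfer_cmd is hypothetical, and it assumes that home, grand, and eagle are mounted at /home, /grand, and /eagle, which this page does not confirm. It prints the HSI command for review and does not run it.

```shell
#!/bin/sh
# Print the HSI transfer command for NAME, adding "-T off" when the
# transfer runs from a directory on home, grand, or eagle.
# Assumption: those file systems are mounted at /home, /grand, /eagle.
# hsi_transfer_cmd is a hypothetical helper, not part of HSI.
hsi_transfer_cmd() {
    verb=$1            # "put" or "get"
    name=$2            # file name relative to the working directory
    dir=${3:-$PWD}     # directory the transfer runs from
    case $dir/ in
        /home/*|/grand/*|/eagle/*) printf '%s -T off %s\n' "$verb" "$name" ;;
        *)                         printf '%s %s\n' "$verb" "$name" ;;
    esac
}

# Print the command for the current directory so it can be checked first.
hsi_transfer_cmd put mydatafile
```

The printed string can be typed at the HSI prompt as-is. Note that the check looks only at the path text, so a directory reached through a symbolic link may be classified incorrectly.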
Most of the usual shell commands will work as expected in the HSI command environment.
For example, checking what files are archived:
[HSI]/home/username-> ls -l
And organizing your archived files:
[HSI]/home/username-> mkdir dataset1
[HSI]/home/username-> mv hpss.file dataset1
[HSI]/home/username-> ls dataset1
[HSI]/home/username-> rm dataset1/hpss.file
It may be necessary to use single or double quotes around metacharacters to avoid having the shell prematurely expand them.
[HSI]/home/username-> get *.c
will not work, but
[HSI]/home/username-> get "*.c"
will retrieve all files ending in .c.
Following normal shell conventions, other special characters in filenames such as whitespace and semicolon also need to be escaped with "\" (backslash). For example:
[HSI]/home/username-> get "data\ file\ \;\ version\ 1"
retrieves the file named "data file ; version 1".
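The effect of quoting can be seen in an ordinary shell without contacting HPSS. The sketch below uses a hypothetical stand-in function, count_args, in place of a real client, and two empty files in a temporary directory. It shows that an unquoted pattern is replaced by the matching local file names before the command ever sees it, while a quoted pattern is passed through unchanged.

```shell
#!/bin/sh
# count_args is a hypothetical stand-in for a client command; it only
# reports how many arguments it received.
count_args() { echo "$#"; }

# Create a scratch directory holding two local .c files.
demo=$(mktemp -d)
: > "$demo/a.c"
: > "$demo/b.c"

# Unquoted: the shell expands *.c to "a.c b.c" before the command runs.
unquoted=$(cd "$demo" && count_args get *.c)
# Quoted: the pattern reaches the command as the single argument "*.c".
quoted=$(cd "$demo" && count_args get "*.c")

echo "unquoted: $unquoted arguments"   # prints "unquoted: 3 arguments"
echo "quoted:   $quoted arguments"     # prints "quoted:   2 arguments"

rm -rf "$demo"
```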
HSI can also be run directly from the command line or embedded in a script, as follows:
hsi -O log.file "put local.file"
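As an illustration of the scripted form, the sketch below wraps the same command in a loop over several files. The function name archive_files and the DRY_RUN variable are hypothetical conveniences, not HSI features. With DRY_RUN left at its default of 1, the commands are only printed, so the script can be checked before anything is sent to HPSS. How repeated -O calls treat an existing log file is not covered on this page, so check the log after the first real run.

```shell
#!/bin/sh
# Archive each named file with one hsi call per file.
# archive_files and DRY_RUN are hypothetical; DRY_RUN=1 (the default)
# prints the commands, DRY_RUN=0 runs them.
archive_files() {
    log=$1; shift
    for f in "$@"; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            printf 'hsi -O %s "put %s"\n' "$log" "$f"
        else
            # Stop at the first failed transfer.
            hsi -O "$log" "put $f" || return 1
        fi
    done
}

# Prints: hsi -O log.file "put local.file"
archive_files log.file local.file
```

File names containing whitespace or semicolons would still need the backslash escaping described earlier; this sketch does not add it.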
HTAR General Usage
HTAR is a tar-like utility that creates tar-format archive files directly in HPSS. It can be run from the command line or embedded in a script.
htar -cf hpssfile.tar localfile1 localfile2 localfile3
htar -xf hpssfile.tar localfile2
Note:
- On Theta, you must first load the HSI module to make HSI and HTAR available: module load hsi
- The current version of HTAR has a 64GB file size limit as well as a path length limit. The recommended client is HSI.
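Because of the file size limit, it can help to look for oversized files before running htar. The following is a sketch under two assumptions that this page does not confirm: that the limit applies to each individual file added to the archive, and that "64GB" means 64 * 1024^3 bytes. The names oversized_files and HTAR_LIMIT_BYTES are hypothetical.

```shell
#!/bin/sh
# Assumption: "64GB" is read as 64 * 1024^3 bytes.
HTAR_LIMIT_BYTES=$((64 * 1024 * 1024 * 1024))

# List regular files under DIR that are larger than LIMIT bytes
# (default: HTAR_LIMIT_BYTES). oversized_files is a hypothetical helper.
oversized_files() {
    dir=$1
    limit=${2:-$HTAR_LIMIT_BYTES}
    # "-size +Nc" matches files strictly larger than N bytes.
    find "$dir" -type f -size +"${limit}c"
}

# Usage: oversized_files mydir
# An empty result means no file under mydir exceeds the limit.
```

Any file that is listed would need to be archived separately with HSI.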
In addition, HPSS is accessible through the Globus endpoint alcf#dtn_hpss. As with HSI and HTAR, you must have a keytab file before using this endpoint. For more information on using Globus, please see Using Globus.
Keytab File Missing
If you see an error like this:
*** HSI: (KEYTAB auth method) - keytab file missing or inaccessible: /
Error - authentication/initialization failed
it means that your account is not yet enabled to use HPSS. Please contact support to have it set up.