Data Transfer

The Blue Gene/Q will connect to other research institutions using a total of 100 Gbit/s of public network connectivity. This allows scientists to transfer datasets to and from other institutions over fast research networks such as the Energy Sciences Network (ESnet) and the Metropolitan Research and Education Network (MREN).

Data Transfer Node Overview

A total of 13 data transfer nodes (DTNs) are available to ALCF users (12 for Mira and one for Vesta), allowing users to perform wide and local area data transfers. Access to the DTNs is provided via the following virtual endpoints (load balanced via DNS):

Mira: miradtn.alcf.anl.gov
Vesta: vestadtn.alcf.anl.gov

The DTNs are individually addressed as dtn01.alcf.anl.gov through dtn12.alcf.anl.gov (Mira) and dtn13.alcf.anl.gov (Vesta), but for load-balancing purposes it is preferable to use the miradtn or vestadtn endpoints for all transfers. Mira DTNs are configured in stripe groups of four servers, so an individual striped file transfer will use no more than four DTNs at once.  The Vesta DTN is a single-server setup without striping.
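The addressing scheme above can be summarized in a short Python sketch. The helper names are illustrative only and are not part of any ALCF tool; the hostnames, node counts, and stripe-group size are taken from the text.

```python
from typing import List

# Illustrative helpers; the names below are hypothetical.
# Hostnames and counts come from the description above.
DOMAIN = "alcf.anl.gov"

# Load-balanced virtual endpoints (preferred for all transfers).
VIRTUAL_ENDPOINTS = {
    "mira": f"miradtn.{DOMAIN}",
    "vesta": f"vestadtn.{DOMAIN}",
}

# DTN numbers behind each endpoint: dtn01-dtn12 for Mira, dtn13 for Vesta.
DTN_NUMBERS = {"mira": range(1, 13), "vesta": range(13, 14)}

# Upper bound on the number of DTNs a single striped transfer can use.
MAX_STRIPE_WIDTH = {"mira": 4, "vesta": 1}


def preferred_endpoint(system: str) -> str:
    """Return the load-balanced endpoint that should be used for a system."""
    return VIRTUAL_ENDPOINTS[system.lower()]


def individual_dtns(system: str) -> List[str]:
    """Return the individually addressed DTN hostnames for a system."""
    return [f"dtn{n:02d}.{DOMAIN}" for n in DTN_NUMBERS[system.lower()]]
```

For example, preferred_endpoint("mira") returns "miradtn.alcf.anl.gov", while individual_dtns("vesta") returns a one-element list containing "dtn13.alcf.anl.gov".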

Data Transfer Services and Utilities

Globus

Globus (http://www.globus.org) addresses the challenges faced by researchers when moving, sharing, and archiving large volumes of data among distributed sites.

With Globus, you hand off data movement tasks to a hosted service that manages the entire operation: monitoring performance and errors, retrying failed transfers, correcting problems automatically whenever possible, and reporting status to keep you informed, so that you can focus on your research. Command-line and web-based interfaces are available. The command-line interface, which requires only ssh to be installed on the client, is the method of choice for script-based workflows. Globus also has a REST-style transfer API (https://transfer.api.globusonline.org/).

After you register, use the ALCF2 endpoint "alcf#dtn_mira" as the source or destination of a transfer, together with any other endpoints you need. You can activate the ALCF2 endpoint using your ALCF credentials. Alternatively, you can use DOEGrids credentials for your transfer. For instructions, see http://www.globus.org/beyondbasics. The ALCF endpoints listed are ALCF's GridFTP server nodes, which are tuned especially for WAN data movement tasks.
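For script-based workflows that use the REST-style transfer API, a transfer request is described by a JSON document. The following is a minimal sketch of how such a document might be assembled in Python. It assumes the field names of the Globus Transfer API's transfer document (DATA_TYPE, submission_id, source_endpoint, destination_endpoint, DATA), which should be checked against the current API documentation. The destination endpoint name, the paths, and the submission ID are hypothetical, and the sketch only builds the document; it does not authenticate or submit anything.

```python
import json


def build_transfer_document(submission_id, source_endpoint,
                            destination_endpoint, items, label=None):
    """Build a transfer request body as a plain dictionary.

    items is a list of (source_path, destination_path) pairs.
    Field names are assumed from the Globus Transfer API; verify them
    against the API documentation before relying on this.
    """
    document = {
        "DATA_TYPE": "transfer",
        "submission_id": submission_id,
        "source_endpoint": source_endpoint,
        "destination_endpoint": destination_endpoint,
        "DATA": [
            {
                "DATA_TYPE": "transfer_item",
                "source_path": src,
                "destination_path": dst,
            }
            for src, dst in items
        ],
    }
    if label is not None:
        document["label"] = label
    return document


if __name__ == "__main__":
    # "myuser#laptop", the paths, and the submission ID are placeholders.
    doc = build_transfer_document(
        submission_id="<submission-id>",
        source_endpoint="alcf#dtn_mira",
        destination_endpoint="myuser#laptop",
        items=[("/<path>/foo.tar", "/<path>/foo.tar")],
        label="example transfer",
    )
    print(json.dumps(doc, indent=2))
```

In practice the submission ID would be obtained from the service before the document is submitted, and the resulting JSON would be sent over an authenticated connection.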

Globus Connect Personal (https://www.globus.org/globus-connect-personal/) allows you to add your laptop or desktop as an endpoint in Globus in just two steps. After you set up Globus Connect Personal, you can use Globus to transfer files to and from your computer.

GridFTP

GridFTP enables data transfer between trusted sites such as NERSC and ORNL, as well as local nodes at ALCF. The simplest way to use GridFTP is through the Globus web service as described above. However, if you prefer working from the command line, or wish to do striped file transfers, you can also use GridFTP directly with globus-url-copy, for example:

globus-url-copy -p 4 -tcp-bs 4M gsiftp://dtn01.nersc.gov/<path>/foo.tar gsiftp://miradtn.alcf.anl.gov/<path>/foo.tar
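When transfers like the one above are driven from a script, it can help to assemble the command line programmatically. The following Python sketch builds the same invocation; the function name and its parameters are illustrative. The -p (parallel streams) and -tcp-bs (TCP buffer size) options are the ones shown above. The optional -stripe flag is an addition not shown in the example above: it is the globus-url-copy option for requesting a striped transfer, and whether it helps depends on the servers at both ends.

```python
import shlex
from typing import List


def build_globus_url_copy(source_url: str, dest_url: str,
                          parallel_streams: int = 4,
                          tcp_buffer: str = "4M",
                          striped: bool = False) -> List[str]:
    """Return a globus-url-copy argument list without executing it."""
    if parallel_streams < 1:
        raise ValueError("parallel_streams must be at least 1")
    cmd = ["globus-url-copy",
           "-p", str(parallel_streams),   # number of parallel data streams
           "-tcp-bs", tcp_buffer]         # TCP buffer size per stream
    if striped:
        cmd.append("-stripe")             # request a striped transfer
    cmd += [source_url, dest_url]
    return cmd


if __name__ == "__main__":
    # <path> is a placeholder, exactly as in the command above.
    cmd = build_globus_url_copy(
        "gsiftp://dtn01.nersc.gov/<path>/foo.tar",
        "gsiftp://miradtn.alcf.anl.gov/<path>/foo.tar",
    )
    # Print a shell-quoted form for inspection instead of running it.
    print(shlex.join(cmd))
```

To run the transfer rather than print it, the returned list can be passed to subprocess.run(cmd, check=True) on a host where globus-url-copy is installed and valid credentials are available.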

SFTP and SCP

These standard utilities are available for local area transfers of small files. They are not recommended for large data transfers because they perform poorly and consume excess resources on the login nodes.

HSI and HTAR

HSI and HTAR allow users to transfer data to and from the High Performance Storage System (HPSS).
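As a rough illustration of typical usage, the Python sketch below builds common HSI and HTAR command lines without executing them. The helper names are hypothetical. The sketch assumes the standard hsi "put local : remote" and "get local : remote" forms and the tar-like htar -cvf and -xvf options; consult the site's HPSS documentation for the options supported locally.

```python
from typing import List


def hsi_put(local_path: str, hpss_path: str) -> List[str]:
    """Store a local file in HPSS under the given HPSS path."""
    return ["hsi", f"put {local_path} : {hpss_path}"]


def hsi_get(local_path: str, hpss_path: str) -> List[str]:
    """Retrieve a file from HPSS into the given local path."""
    return ["hsi", f"get {local_path} : {hpss_path}"]


def htar_create(archive: str, *paths: str) -> List[str]:
    """Bundle local files or directories into an archive stored in HPSS."""
    if not paths:
        raise ValueError("at least one path is required")
    return ["htar", "-cvf", archive, *paths]


def htar_extract(archive: str) -> List[str]:
    """Extract an archive stored in HPSS into the current directory."""
    return ["htar", "-xvf", archive]
```

For example, htar_create("results.tar", "run01") yields the argument list for htar -cvf results.tar run01, where results.tar and run01 are placeholder names. Any of these lists can be passed to subprocess.run on a host where the HPSS client tools are installed.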