Magellan: Cloud Computing for Science

Magellan was a research and development effort to establish a nationwide scientific mid-range distributed computing and data analysis testbed. It had two sites (NERSC and ALCF) with tens of teraflops of computing capacity and multiple petabytes of storage, as well as appropriate cloud software tuned for moderate concurrency.

The goal of Magellan, a project funded through the U.S. Department of Energy (DOE) Office of Advanced Scientific Computing Research (ASCR), was to investigate the potential role of cloud computing in addressing the computing needs of the DOE Office of Science (SC), particularly the needs of mid-range computing and future data-intensive computing workloads. A set of research questions was formed to probe various aspects of cloud computing in terms of performance, usability, and cost.

To address these questions, a distributed testbed infrastructure was deployed at the Argonne Leadership Computing Facility (ALCF) and the National Energy Research Scientific Computing Center (NERSC). The testbed was designed to be flexible and capable enough to explore a variety of computing models and hardware design points in order to understand their impact on various scientific applications. During the project, the testbed also served as a valuable resource to application scientists. The Magellan project used applications from a diverse set of projects, such as MG-RAST (a metagenomics analysis server), the Joint Genome Institute, the STAR experiment at the Relativistic Heavy Ion Collider, and the Laser Interferometer Gravitational Wave Observatory (LIGO), for benchmarking within the cloud, and those project teams were also able to accomplish important production science using the Magellan cloud resources.

Cloud computing has garnered significant attention from both industry and research scientists as it has emerged as a potential model to address a broad array of computing needs and requirements such as custom software environments and increased utilization among others. Cloud services, both private and public, have demonstrated the ability to provide a scalable set of services that can be easily and cost-effectively utilized to tackle various enterprise and web workloads. These benefits are a direct result of the definition of cloud computing: on-demand self-service resources that are pooled, can be accessed via a network, and can be elastically adjusted by the user. The pooling of resources across a large user base enables economies of scale, while the ability to easily provision and elastically expand the resources provides flexible capabilities.

Project Goals

Through this project we wanted to promote open interface specifications for clouds, as well as to determine:

  1. Are the open source cloud software stacks ready for DOE HPC science?
  2. Can DOE cyber security requirements be met within a cloud?
  3. Are the new cloud programming models useful for scientific computing?
  4. Can DOE HPC applications run efficiently in the cloud? What applications are suitable for clouds?
  5. How usable are cloud environments for scientific applications?
  6. When is it cost effective to run DOE HPC science applications in a cloud?

Project Findings & Recommendations

The full report on the Magellan project can be found on the DOE web site.