Storage, Scheduling and User Interfacing Challenges in Distributed Systems

Ketan Maheshwari
Seminar

Modern many-task applications running over distributed systems face three coexisting challenges: 1) Data Storage and Retrieval; 2) Task Scheduling; and 3) User Interfacing.

As distributed systems such as federated campus- and nation-wide computing infrastructures and private-public clouds proliferate, mapping applications effectively onto these infrastructures poses significant challenges.

In particular, our current research addresses the following challenges (denoted by C#), each paired with our approach (denoted by A#):

C1. How to efficiently use the available and auxiliary storage space to retrieve and store data for processing?
A1. We attempt to understand the nature of a wide-area cloud implementation. We analyze and evaluate research and commercial storage solutions on the Amazon cloud. We demonstrate application porting in such environments via the Swift framework.

C2. How to schedule tasks from a single application onto distributed systems such that the computation is steered based on resource and network capability?
A2. We use two strategies to address the above challenge: 1) overlapping data movement and execution via superscalar-style pipelines and 2) utilizing a scheduling and modeling scheme specifically designed for wide-area distributed task scheduling.
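
As a rough illustration of the first strategy, the following Python sketch overlaps data movement with execution by staging in the next task's input while the current task runs. It is a minimal sketch under stated assumptions, not the Swift implementation: the names stage_in, execute, run_pipelined and run_serial are hypothetical, and the sleeps merely stand in for transfer and compute time.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def stage_in(task_id):
    """Hypothetical stand-in for moving a task's input to the compute site."""
    time.sleep(0.05)  # assumed transfer latency, for illustration only
    return f"data-{task_id}"


def execute(data):
    """Hypothetical stand-in for running a task on already-staged data."""
    time.sleep(0.05)  # assumed compute time, for illustration only
    return f"result-for-{data}"


def run_serial(task_ids):
    """Baseline: each task waits for its own transfer before executing."""
    return [execute(stage_in(task_id)) for task_id in task_ids]


def run_pipelined(task_ids):
    """Prefetch the next task's data while the current task executes."""
    results = []
    if not task_ids:
        return results
    # A single background worker plays the role of the data mover.
    with ThreadPoolExecutor(max_workers=1) as mover:
        pending = mover.submit(stage_in, task_ids[0])
        for next_id in task_ids[1:]:
            data = pending.result()                    # wait for current input
            pending = mover.submit(stage_in, next_id)  # start the next transfer
            results.append(execute(data))              # compute overlaps transfer
        results.append(execute(pending.result()))      # last task, nothing to prefetch
    return results


if __name__ == "__main__":
    print(run_pipelined([1, 2, 3]))
```

Both functions return the same results in the same order; the pipelined variant only changes when transfers happen, so after the first task the transfer cost can be hidden behind computation when the two take comparable time.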

C3. How to provide application composition, execution monitoring, and runtime resource provisioning for distributed systems?
A3. We address this challenge by: 1) effectively integrating the front-end features of the Galaxy portal with the Swift runtime as the backend and 2) designing and prototyping a dynamic resource provisioning suite for the Amazon cloud.

We believe that the combination of the above efforts enables a powerful distributed computing environment for science users.