Automatic Data Filtering for In Situ Workflows

Clément Mommessin
Seminar

In situ workflows contain tasks that exchange messages composed of several data fields. However, a consumer task may not need all the data fields from its producer. The user must then decide whether to specialize the output of a producer task for a particular consumer to get better performance, or to send more data than the consumer requires. The first option limits task portability, while the second wastes resources. In this talk, we introduce contracts for in situ tasks. A contract specifies, for a producer, each data field available for output and, for a consumer, the data fields needed as input. Comparing the contracts of a producer and a consumer allows automatic selection, at runtime, of the data fields the producer has to send to that consumer, removing the need to specialize code while sending only the necessary data. We integrated the contract mechanism within Decaf, a middleware for building and executing in situ workflows. We evaluated the cost and performance of data selection at runtime with both a synthetic example and a real scientific workflow coupling a molecular dynamics simulation with three different data analytics codes.
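
As a rough illustration of the contract idea described above, the following Python sketch matches a producer contract against a consumer contract and filters a message accordingly. It is not Decaf's API: the dictionary-based contract representation, the type strings, the field names, and the functions match_contracts and filter_message are all hypothetical, chosen only to show the principle.

```python
from typing import Any, Dict

# Hypothetical contract representation (an assumption, not Decaf's format):
# a mapping from data field name to a type name.
Contract = Dict[str, str]


def match_contracts(producer: Contract, consumer: Contract) -> Contract:
    """Return the fields the producer has to send to this consumer.

    Raises ValueError if the consumer needs a field the producer does
    not offer, or if the declared types of a field disagree.
    """
    selected: Contract = {}
    for name, wanted_type in consumer.items():
        if name not in producer:
            raise ValueError(f"producer does not provide field {name!r}")
        if producer[name] != wanted_type:
            raise ValueError(f"type mismatch for field {name!r}")
        selected[name] = wanted_type
    return selected


def filter_message(message: Dict[str, Any], selected: Contract) -> Dict[str, Any]:
    """Keep only the fields chosen by the contract matching."""
    return {name: message[name] for name in selected}


# Illustrative field names for a molecular dynamics producer.
producer_contract = {"positions": "float[]", "velocities": "float[]", "ids": "int[]"}
consumer_contract = {"positions": "float[]", "ids": "int[]"}

# Matching can be done once, before the workflow starts exchanging data.
selected = match_contracts(producer_contract, consumer_contract)

# At runtime the producer emits all its fields; only the selected ones are sent.
message = {"positions": [0.0, 1.0], "velocities": [0.5, 0.5], "ids": [7]}
filtered = filter_message(message, selected)
```

In this sketch the producer code stays generic, since it always emits every field it declares, and the selection for each consumer is derived from the two contracts rather than written by hand.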

Short Bio:
Clément Mommessin is a former Master’s student in Computer Science at Grenoble Alpes University, Grenoble, France, majoring in Parallel, Distributed and Embedded Systems. Clément started working as a research aide in the MCS division of Argonne in January 2017 as a member of the Decaf project. This talk will present the results of his work during his visit. In September 2017, Clément will start his PhD thesis at INRIA Rhone-Alpes, Grenoble, France, on the topic of scheduling parallel applications on heterogeneous platforms, under the supervision of Denis Trystram.