Improving Metadata Operation Performance on a Replicated Object Storage System

Ellis Wilson
Seminar

Abstract: In scientific computing, highly concurrent bursts of writes create unique challenges for parallel file system designers. To meet these demands, file systems are being designed for large scale and commodity storage. A common characteristic of parallel file systems is the object storage abstraction layer, a software implementation of the T10 OSD specification. Past work has established that replicated object storage systems are effective for highly concurrent, bandwidth-bound large reads and writes. However, it has not been shown that such a system performs well with highly concurrent, latency-bound metadata operations, which are often quite small.

To improve performance for the latter type of access, we have chosen to implement B-trees designed to exist within the object storage abstraction layer. Because of their efficacy on secondary storage and our encouraging preliminary results, we believe they will improve performance for metadata operations. A number of challenges remain, including tuning the B-trees to the specific memory hierarchy in question and interfacing them with caches, which can further improve the performance of latency-bound metadata operations.

Bio: Ellis Wilson III is currently an intern at Argonne National Laboratory and a Ph.D. student at the Pennsylvania State University. He received his undergraduate degree from La Salle University. His research interests include the simulation of highly parallel environments, multi-level cache replacement and partitioning policies, and parallel file systems and their underlying metadata storage structures. He is a member of the IEEE and the ACM. More details about Ellis Wilson are available at http://www.cse.psu.edu/~ehw111.