PSA: A Performance and Space-Aware Data Layout Scheme for Hybrid Parallel File Systems

Authors: S. He, Y. Liu, X.-H. Sun

Date: November, 2014

Venue: Data Intensive Scalable Computing Systems Workshop (DISCS), in conjunction with ACM/IEEE SuperComputing 2014, New Orleans, LA, USA

Type: Workshop

Abstract

The underlying storage of a hybrid parallel file system (PFS) is composed of both SSD-based file servers (SServer) and HDD-based file servers (HServer). Compared with a traditional HServer, an SServer consistently provides higher storage performance but has less storage space. However, most current data layout schemes do not consider the differences in performance and space between heterogeneous servers, and may significantly degrade the performance of hybrid PFSs. In this paper, we propose PSA, a novel data layout scheme that maximizes hybrid PFS performance by applying adaptive, varied-size file stripes. PSA dispatches data onto heterogeneous file servers based not only on storage performance but also on storage space. We have implemented PSA within OrangeFS, a popular parallel file system in the HPC domain. Our extensive experiments using a representative benchmark show that PSA provides higher I/O throughput than the default and performance-aware file data layout schemes.
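
To make the idea of a layout that weighs both performance and space more concrete, the following is a minimal illustrative sketch in Python. It is not the PSA algorithm from the paper, whose details the abstract does not give. It assumes a hypothetical function `plan_shares` and a simple model in which each server has a known bandwidth and a known amount of free space: data is first split in proportion to bandwidth, and any server whose share would exceed its free space is capped, with the excess redistributed among the remaining servers.

```python
def plan_shares(file_size, servers):
    """Split file_size bytes across servers (illustrative model only).

    servers maps a server name to (bandwidth, free_bytes). Both the
    function and its input format are assumptions for this sketch.
    """
    shares = {}
    remaining = file_size
    active = dict(servers)

    while active and remaining > 0:
        total_bw = sum(bw for bw, _ in active.values())
        # Performance-aware step: share proportional to bandwidth.
        ideal = {name: remaining * bw / total_bw
                 for name, (bw, _) in active.items()}
        # Space-aware step: servers that cannot hold their ideal share.
        full = [name for name in active if ideal[name] > active[name][1]]
        if not full:
            shares.update(ideal)
            remaining = 0
            break
        # Cap full servers at their free space and retry with the rest.
        for name in full:
            free = active.pop(name)[1]
            shares[name] = free
            remaining -= free

    if remaining > 0:
        raise ValueError("not enough free space across all servers")
    return shares


if __name__ == "__main__":
    servers = {
        "hserver0": (100, 10**12),  # slow, large
        "hserver1": (100, 10**12),  # slow, large
        "sserver0": (400, 100),     # fast, nearly full
    }
    print(plan_shares(1200, servers))
```

In this toy model, a nearly full SServer receives only what it can hold, and the HServers absorb the rest. With ample SSD space, the split falls back to a purely bandwidth-proportional one.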