Performance-Aware Data Placement in Hybrid Parallel File Systems

Authors: S. He, X.-H. Sun, B. Feng, K. Feng

Date: August 2014

Venue: 14th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP), Dalian, China

Type: Conference

Abstract

Hybrid parallel file systems (PFSs), which consist of both HDD and SSD servers, provide a promising solution for data-intensive applications. In this study, we propose a performance-aware data placement (PADP) strategy to enable efficient data layout in hybrid PFSs. The basic idea of PADP is to distribute data across file servers using adaptive, varied-size file stripes chosen according to each server's storage performance. The appropriate stripe size for each file server is determined using a data access cost model and a linear programming optimization method. We have implemented PADP within OrangeFS, a widely used parallel file system in the HPC domain. Experimental results with a representative benchmark show that PADP can significantly improve the I/O performance of hybrid PFSs.
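The intuition behind varied-size stripes can be illustrated with a minimal sketch. It is not the paper's cost model or its linear program: it assumes a simplified model in which a server's service time is stripe size divided by bandwidth, and it sizes stripes in proportion to bandwidth so that all servers finish a striping round at about the same time. The function names, the bandwidth figures, and the 4 KB allocation unit are hypothetical and chosen only for illustration.

```python
def proportional_stripes(bandwidths, round_size_kb, unit_kb=4):
    """Split one striping round across servers in proportion to bandwidth.

    Simplifying assumption (not from the paper): service time on a server
    is stripe_size / bandwidth, with no startup or network cost.
    """
    total_bw = sum(bandwidths)
    units = round_size_kb // unit_kb  # whole allocation units to hand out
    ideal = [units * bw / total_bw for bw in bandwidths]
    sizes = [int(x) for x in ideal]  # round each share down
    # Give leftover units to the servers with the largest fractional remainder.
    leftover = units - sum(sizes)
    order = sorted(range(len(ideal)),
                   key=lambda i: ideal[i] - sizes[i], reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    return [s * unit_kb for s in sizes]


def round_time(stripes_kb, bandwidths):
    """Completion time of one round: the slowest server determines it."""
    return max(s / bw for s, bw in zip(stripes_kb, bandwidths))


# Hypothetical setup: two HDD servers and two SSD servers (relative bandwidths).
bandwidths = [100, 100, 300, 300]
uniform = [64, 64, 64, 64]                    # fixed-size stripes, 256 KB round
adaptive = proportional_stripes(bandwidths, 256)

print(adaptive)
print(round_time(uniform, bandwidths), round_time(adaptive, bandwidths))
```

Under this simplified model, the fixed-size layout is bounded by the HDD servers, while the proportional layout gives the SSD servers larger stripes and shortens the round. A fuller treatment would replace the proportional rule with a cost model and an optimization step, as the abstract describes.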