HAS: Heterogeneity-Aware Selective Data Layout Scheme for Parallel File Systems on Hybrid Servers
Authors: S. He, X.-H. Sun, A. Haider
Date: May, 2015
Venue: 29th IEEE International Parallel and Distributed Processing Symposium (IPDPS'15), Hyderabad, India
Type: Conference
Abstract
Hybrid parallel file systems (PFSs), consisting of multiple HDD and SSD I/O servers, provide a promising design for data-intensive applications. The efficiency of a hybrid PFS relies on the file's data layout. However, most current layout strategies are designed and optimized for homogeneous servers. Using them directly in a hybrid PFS addresses neither the heterogeneity of servers nor the varying access patterns of applications, making hybrid PFSs disappointingly inefficient. In this paper, we propose HAS, a novel heterogeneity-aware selective data layout scheme for hybrid PFSs. HAS alleviates inter-server load imbalance by skewing the data distribution across heterogeneous servers according to their storage performance. To improve the I/O efficiency of the entire system, HAS adaptively selects the optimal data layout from three typical candidates according to the application's data access patterns, using a newly developed selection and distribution algorithm. We have implemented HAS within OrangeFS to provide efficient data distribution for data-intensive applications. Our extensive experiments validate that HAS significantly increases the I/O throughput of hybrid PFSs compared to existing data layout optimization methods.
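The idea of skewing the data distribution according to server performance can be illustrated with a minimal sketch. This is not the selection and distribution algorithm from the paper, which the abstract does not specify. It assumes a simple proportional rule, where each server's share of one striping round is proportional to its measured bandwidth so that fast and slow servers finish at roughly the same time. The function name, the parameters, and the 4 KiB allocation unit are all hypothetical.

```python
def skewed_stripe_sizes(bandwidths, round_size_kib, unit_kib=4):
    """Split one striping round across servers in proportion to bandwidth.

    Illustrative assumption only: a plain proportional rule, not the
    cost model used by HAS.
    bandwidths      relative or absolute throughput per server
    round_size_kib  total data written in one round over all servers
    unit_kib        allocation granularity (hypothetical default)
    """
    total_bw = sum(bandwidths)
    units = round_size_kib // unit_kib

    # Ideal (fractional) number of units per server.
    ideal = [units * bw / total_bw for bw in bandwidths]
    sizes = [int(x) for x in ideal]

    # Hand out the units lost to truncation, largest remainder first,
    # so the per-server sizes add up to the whole round.
    leftover = units - sum(sizes)
    by_remainder = sorted(
        range(len(ideal)), key=lambda i: ideal[i] - sizes[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        sizes[i] += 1

    return [s * unit_kib for s in sizes]


if __name__ == "__main__":
    # Two HDD servers and two SSD servers, with SSDs assumed 4x faster.
    print(skewed_stripe_sizes([100, 100, 400, 400], round_size_kib=1000))
```

With equal bandwidths the sketch reduces to the uniform striping used by layouts designed for homogeneous servers, which is the case the abstract argues is inefficient on hybrid servers.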