
Rethinking High Performance Computing System Architecture for Scientific Big Data Applications

Authors: Y. Chen, C. Chen, Y. Yin, X.-H. Sun, R. Thakur, W. D. Gropp

Date: August, 2016

Venue: 14th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA 2016), Tianjin, China

Type: Conference

Abstract

Data-intensive scientific discovery is increasingly important, and it poses a critical question to the high performance computing (HPC) community: how can HPC systems that are traditionally designed for big compute applications efficiently support these growing scientific big data applications? Conventional HPC systems are computing-centric and designed for computation-intensive applications. Scientific big data applications have increasingly different characteristics from big compute applications, yet they will still largely rely on HPC systems to be solved. In this research, we try to answer this question by rethinking HPC system architecture. We study and analyze the potential of a new decoupled HPC system architecture for data-intensive scientific applications. The fundamental idea is to decouple conventional compute nodes and dynamically provision nodes as data processing nodes that focus on data processing capability. We present studies and analyses of such a decoupled HPC system architecture. The current results show its promising potential. Its data-centric architecture can have an impact on the design and development of future HPC systems for increasingly important data-intensive scientific discovery and innovation.