An Implementation and Evaluation of Memory-based Checkpointing (Poster Presentation)
Authors: H. Jin, X.-H. Sun, B. Xie, Y. Chen
Date: November, 2009
Venue: The ACM/IEEE SuperComputing Conference (SC'09), Portland, OR, USA
Type: Conference
Abstract
- Review of the state of the art of memory-based checkpointing
- Reliability analysis of memory-based checkpointing
- Failure-aware node matching
- Design and implementation
- Flexible combination of disk-based and memory-based checkpointing
- Comprehensive evaluation
- Future work
  - Implementation on other checkpointing systems
  - Implementation on coordinated checkpointing
  - Dynamic node matching with predicted memory usage, job, etc.

RES: Reliable, Efficient, Scalable Checkpointing Environment.