Projects

ChronoLog: A High-Performance Storage Infrastructure for Activity and Log Workloads

GRC-LED, FUNDED, OPEN SOURCE

HPC applications generate more data than storage systems can handle, and it is becoming increasingly important to store activity (log) data generated by people and applications. ChronoLog is a hierarchical, distributed log store that leverages physical time to achieve log ordering and reduce contention while utilizing storage tiers to elastically scale the log capacity.

Coeus: Accelerating Scientific Insights Using Enriched Metadata

GRC-LED, FUNDED

In collaboration with Sandia and Oak Ridge National Laboratories, Coeus investigates the use of an active storage system to calculate derived quantities and support complex queries on scientific data (simulation and observational), and to optimize data placement across the storage hierarchy with awareness of resource limitations, in order to better support the scientific discovery process.

DaYu: Optimizing Distributed Scientific Workflows by Decoding Dataflow Semantics and Dynamics

GRC-LED, FUNDED, OPEN SOURCE

Distributed scientific workflows face challenges in moving data through storage systems. By capturing the mapping of data objects to I/O operations, DaYu can uncover new insights for optimizing workflow data movement.

DTIO: A Data Task I/O Runtime

GRC-LED, FUNDED

In partnership with Argonne National Laboratory, DTIO investigates the use of a task framework for unifying complex I/O stacks and providing features such as resilience, fault tolerance, and task replay.

Hermes: Extending the HDF Library to Support Intelligent I/O Buffering for Deep Memory and Storage Hierarchy System

GRC-LED, FUNDED, OPEN SOURCE

To reduce the I/O bottleneck, complex storage hierarchies have been introduced. However, managing this complexity should not be left to application developers. Hermes is a middleware library that automatically manages buffering in heterogeneous storage environments.

IOWarp: Advanced Data Management for Scientific Workflows

GRC-LED, FUNDED, OPEN SOURCE

IOWarp is a comprehensive data management platform designed to address the unique challenges in scientific workflows that integrate simulation, analytics, and Artificial Intelligence (AI). IOWarp builds on existing storage infrastructures, optimizing data flow and providing a scalable, adaptable platform for managing diverse data needs in modern scientific workflows, particularly those augmented by AI.

IRIS: I/O Redirection Via Integrated Storage

GRC-LED, FUNDED, OPEN SOURCE

Many storage solutions exist, each requiring its own specialized APIs and data models, which binds developers, applications, and entire computing facilities to particular interfaces. Each storage system is designed and optimized for certain applications but does not perform well for others. IRIS is a unified storage access system that bridges the semantic gap between filesystems and object stores.

LABIOS: A Distributed Label-Based I/O System

GRC-LED, FUNDED

HPC and Big Data environments have diverged over the years, resulting in different and even conflicting I/O requirements. LABIOS aims to address the challenges vital to the convergence of HPC and Big Data.

Optimization of Memory Architectures: A Foundation Approach

GRC-LED, FUNDED

This project establishes a foundational framework for memory performance modeling and optimization in modern architectures, utilizing simulation and real-system analysis to advance architecture designs for data-intensive applications.

StoreHub

GRC-LED, FUNDED

StoreHub is a collaborative platform designed to advance data storage research by providing a specialized infrastructure that meets the unique needs of researchers. It brings together experts handling large amounts of data, focusing on I/O performance, and developing innovative storage solutions, making it a vital resource for the community.

UniMCC: Towards A Unified Memory-centric Computing System with Cross-layer Support

GRC-LED, FUNDED

UniMCC addresses memory bottlenecks in data-centric applications with a full-stack, cross-layer system that integrates architecture, SW/HW interfaces, code generation, runtime support, and performance optimization to maximize memory-centric computing's potential.

Viper: A High-Performance I/O Framework for Transferring Deep Neural Network Models

GRC-LED, FUNDED

Within a deep learning (DL) workflow, exchanging DNN models through a parallel file system (PFS) may result in high model update latency and discovery latency. Moreover, model update frequency affects both training and inference performance. Viper is an I/O framework that aims to accelerate model discovery and delivery, and to find an optimal model checkpoint schedule that balances the trade-off between training and inference performance.

WisIO: Automated I/O Bottleneck Detection via Multi-Perspective Views for HPC Workloads

GRC-LED

WisIO is an automated I/O bottleneck detection tool that offers multi-perspective views for analyzing I/O trace data. To handle large-scale I/O traces, WisIO uses distributed computing and an extensible rule engine that can be tailored to specific HPC workloads.