
IOWarp: Advanced Data Management Platform for AI-Augmented Scientific Workflows


IOWarp is a $5 million NSF-funded platform (Award #2411318, 2024-2029) that provides infrastructure for intelligent I/O orchestration in scientific computing. It is a comprehensive data management platform designed to address the unique challenges of scientific workflows that integrate simulation, analytics, and Artificial Intelligence (AI). Building on existing storage infrastructures, IOWarp optimizes data flow and provides a scalable, adaptable platform for managing the diverse data needs of modern scientific workflows, particularly those augmented by AI.


Project Scope and Vision

Project Goals

IOWarp focuses on:

  • Enhancing data exchange and transformation across scientific workflows.
  • Reducing data access latency with advanced storage systems.
  • Developing an open-source, community-driven framework that supports adaptability and innovation.

Vision

IOWarp envisions a modular and flexible architecture that adapts to the data demands of scientific research, particularly in High-Performance Computing (HPC). This platform aligns with NSF's focus on sustainable, adaptable solutions that can support next-generation scientific workflows.


Key Challenges in Scientific Data Management

  1. Data Heterogeneity: Managing a variety of data formats across workflow stages.
  2. Data Scale: Addressing the rapidly increasing volume and velocity of data.
  3. Data Access Speed: Overcoming limitations in I/O speed for real-time analytics.
  4. Data Integrity: Ensuring quality and consistency across storage and access points.
  5. Resource Utilization: Optimizing storage and compute resources to reduce costs and environmental impact.
  6. Interoperability: Enabling seamless data transfer across workflow stages and computing paradigms.

From Current State to IOWarp: The Transformation

IOWarp transforms how scientific data flows through modern HPC systems by introducing intelligent orchestration across the entire storage hierarchy.

Current State: Traditional PFS Architecture


Traditional parallel file system architecture with limited data management capabilities

Future State: IOWarp-Enhanced Architecture


IOWarp-enhanced architecture with intelligent data orchestration and AI integration

Traditional systems rely on manual data management and static storage hierarchies, leading to inefficient data movement and underutilized resources. IOWarp introduces intelligent, automated data orchestration that adapts to workflow demands, dramatically improving performance and resource utilization.


Architecture

Modular Architecture

IOWarp's architecture comprises several core components designed to handle various aspects of data flow, integrity, and interoperability.

IOWarp Architecture

Technical Architecture

This section describes the core components of IOWarp's data management platform and their functionality within scientific workflows.

Content Assimilation Engine (CAE)

The Content Assimilation Engine (CAE) transforms diverse format-specific data into IOWarp's unified data representation format, Content, optimized for data transfer. The CAE:

  • Integrates with data sources (e.g., Globus, S3, PFS).
  • Applies data layout and semantic tagging, preserving context across workflow stages.
  • Exports data back to repositories post-processing, ensuring data longevity.
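To make the idea of a tagged, format-neutral Content record concrete, here is a minimal sketch in Python. The class and field names (`Content`, `blob_id`, `tags`) and the `assimilate` helper are illustrative assumptions, not IOWarp's actual schema or API.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a "Content" record as the CAE might produce it.
# Field names and structure are illustrative assumptions, not IOWarp's schema.
@dataclass
class Content:
    blob_id: str                 # unique identifier within IOWarp
    source_uri: str              # where the data was assimilated from (S3, Globus, PFS)
    payload: bytes               # format-neutral byte layout, optimized for transfer
    tags: dict = field(default_factory=dict)  # semantic tags preserved across stages

def assimilate(source_uri: str, raw: bytes) -> Content:
    """Wrap raw repository data in a tagged Content record."""
    return Content(
        blob_id=f"content-{abs(hash(source_uri)) % 10**8}",
        source_uri=source_uri,
        payload=raw,
        tags={"origin": source_uri.split("://")[0], "bytes": len(raw)},
    )

record = assimilate("s3://bucket/temperature.h5", b"\x00" * 1024)
print(record.tags["origin"], record.tags["bytes"])
```

The point of the sketch is the round trip: because tags travel with the payload, semantic context survives as data moves between workflow stages and is later exported back to its repository.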

Content Transfer Engine (CTE)

The Content Transfer Engine (CTE) manages efficient data flow across workflow stages and storage systems.


CTE architecture with LabIOS integration for multi-tiered storage management

Key features include:

  • Multi-tiered I/O: Supports interactions with advanced storage hardware, including NVMe SSDs and CXL-powered devices.
  • GPU Direct I/O: Directly transfers data between GPUs for faster model training and inference.
  • Secure Transfer Protocols: Ensures data security during transfers.
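The multi-tiered placement decisions the CTE makes can be illustrated with a toy heuristic. The tier names, capacities, and the hot/cold policy below are assumptions for illustration only, not IOWarp's actual placement algorithm.

```python
# Toy placement heuristic illustrating multi-tiered I/O.
# Tier names, capacities, and thresholds are illustrative assumptions.
TIERS = [
    ("DRAM", 64 * 2**30),    # fastest, smallest
    ("NVMe", 2 * 2**40),
    ("PFS",  100 * 2**40),   # slowest, largest
]

def place(size_bytes: int, hot: bool) -> str:
    """Pick the fastest tier that fits; demote cold data one tier down."""
    fitting = [name for name, cap in TIERS if size_bytes <= cap]
    if not fitting:
        raise ValueError("object exceeds all tier capacities")
    if hot:
        return fitting[0]                      # hot data on fastest fitting tier
    return fitting[min(1, len(fitting) - 1)]   # cold data skips the fastest tier

print(place(1 * 2**30, hot=True))   # small hot object -> DRAM
print(place(1 * 2**30, hot=False))  # same object, cold -> NVMe
```

A real engine would also weigh access patterns, bandwidth contention, and device wear; the sketch only shows why a tier hierarchy plus a demotion rule already beats static placement.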

Content Exploration Interface (CEI)

The Content Exploration Interface (CEI) enables advanced data querying and retrieval, incorporating tools like:


CEI architecture showing natural language query processing and integration

  • WarpGPT: A language model-driven interface for complex scientific queries, capable of handling anomaly detection, mathematical operations, and user-defined extensions.
  • FAIR Compliance: Implements principles to support Findable, Accessible, Interoperable, and Reusable data within scientific workflows.

Platform Plugins Interface (PPI)

The Platform Plugins Interface (PPI) extends IOWarp's functionality, allowing integration with external services, such as:

  • Global Schedulers (e.g., Slurm): For resource and task allocation.
  • Workflow Managers (e.g., Pegasus): For task orchestration and system telemetry.
  • Custom Libraries: Allows integration with libraries for data tracing, encryption, and transformations.

High-Level Data Flow in IOWarp

The data flow within IOWarp follows an organized pipeline from acquisition and transformation to storage and retrieval. Here's a typical data path:

  1. Data Ingestion via the Content Assimilation Engine.
  2. Data Storage Optimization through the Content Transfer Engine and hardware-optimized storage.
  3. Data Retrieval using the Content Exploration Interface, with support for complex, low-latency queries.

API Descriptions

Core APIs

  1. Repository Connection API: Manages connection to external data sources.

    • Example Methods: link/unlink, upload/download.
  2. Content Management API: Allows querying, editing, and locating content based on metadata and tags.

    • Example Methods: queryContent, editContent.
  3. Content Exploration API: Supports advanced data operations with low-latency retrieval.

    • Example Methods: processQuery, executeDAG.
  4. AI/ML Integration APIs: Facilitates data exchange for training and inference tasks within AI frameworks like TensorFlow or PyTorch.

    • Example Methods: defineDataset, prefetchToGPU.
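A client session tying the four API groups together might look like the following sketch. The `IOWarpClient` class and every signature in it are hypothetical, modeled only on the example method names listed above; they are not IOWarp's real API.

```python
# Hypothetical client session exercising the four core API groups.
# IOWarpClient and all signatures are illustrative assumptions based on
# the example method names above, not IOWarp's actual interface.
class IOWarpClient:
    def __init__(self):
        self.repos, self.content = {}, {}

    # Repository Connection API
    def link(self, name: str, uri: str):
        self.repos[name] = uri

    # Content Management API
    def editContent(self, key: str, **tags):
        self.content.setdefault(key, {}).update(tags)

    def queryContent(self, **tags):
        return [k for k, v in self.content.items()
                if all(v.get(t) == val for t, val in tags.items())]

    # AI/ML Integration API
    def defineDataset(self, keys):
        return {"items": list(keys), "target": "gpu0"}

client = IOWarpClient()
client.link("archive", "globus://endpoint/simulations")
client.editContent("run-042", stage="analysis", format="bp5")
keys = client.queryContent(format="bp5")
dataset = client.defineDataset(keys)
print(keys, dataset["target"])
```

The flow mirrors the pipeline described earlier: connect a repository, tag and query content by metadata, then hand the result set to an AI framework as a prefetchable dataset.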

Towards Agentic-Driven Scientific Workflows

Modern scientific computing is evolving towards agentic-driven workflows, where AI agents autonomously orchestrate complex computational tasks, interact with scientific data, and manage HPC resources through natural language interfaces. IOWarp is at the forefront of this transformation, providing the infrastructure and tools necessary to enable intelligent, automated scientific discovery.

Our work on agentic-driven scientific workloads includes the development of Warpio CLI and integration with Model Context Protocols (MCPs), enabling researchers to interact with HPC systems, scientific data formats, and computational workflows using natural language and AI-powered assistance.


Warpio CLI: Command-line interface for agentic scientific workflows

AI Agents for Scientific Computing

Following Anthropic's November 2024 release of the Model Context Protocol (MCP), IOWarp has pioneered the integration of AI agents into scientific computing environments. These agents can:

  • Understand Scientific Data: Parse and analyze data from formats like HDF5, Adios BP5, NetCDF
  • Orchestrate Workflows: Submit jobs, manage resources, and coordinate multi-step computational pipelines
  • Generate Insights: Perform data analysis, create visualizations, and answer scientific queries
  • Ensure Reproducibility: Track provenance and enable replay of complex workflows

Model Context Protocols for Science


Warpio accessing the National Data Platform to download and analyze the latest EarthScope seismological datasets

IOWarp provides a growing collection of scientific MCPs available at github.com/iowarp/iowarp-mcps. Some examples include:

  • Adios MCP: Analyze and query Adios BP5 files from simulations like LAMMPS, WRF, and more
  • HDF5 MCP: Explore HDF5 datasets, read metadata, and extract scientific data
  • Jarvis MCP: Automated application deployment, environment management, and orchestration
  • Slurm MCP: Job scheduling, resource allocation, and queue management
  • System Monitoring MCPs: Real-time performance metrics and resource utilization tracking
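MCP servers such as these are registered in the host agent's configuration file. The sketch below follows the Claude Desktop-style `mcpServers` format; the exact launch command and arguments are assumptions here and depend on each MCP's README in the iowarp-mcps repository.

```json
{
  "mcpServers": {
    "adios": {
      "command": "uvx",
      "args": ["iowarp-mcps", "adios"]
    }
  }
}
```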

Scientific Application Examples

IOWarp supports a wide range of scientific applications through its flexible architecture and scientific MCPs, enabling researchers to interact with complex data using natural language.

LAMMPS Molecular Dynamics Simulation


Atom trajectory analysis from LAMMPS simulation


Detailed single-atom trajectory over time

Dataset: LAMMPS molecular dynamics simulation output stored in Adios BP5 format with atom position data over time.

Execution: Agents used Adios-MCP, which provides AI systems with the ability to read and query Adios BP5 files, to analyze the simulation data and generate visualizations.

Prompt: "Can you write me a python script that plots the trajectory of a single atom over time. The atom of choice should be a parameter to the script. The output should be a PNG image with the results. Please run the script for atom 23."

Result: The Adios-MCP enabled the agent to automatically read the BP5 file structure, extract position data for the specified atom, generate Python visualization code, execute the script, and produce the trajectory visualization. This eliminates the need for researchers to manually parse the Adios file format or write custom analysis scripts.
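The core of the kind of script the agent generates can be sketched as follows. To stay self-contained, synthetic data stands in for the BP5 read (a real run would load positions through the adios2 Python bindings); the function name and array layout are assumptions for illustration.

```python
import numpy as np

# Sketch of the analysis step the agent generates for this prompt.
# A real script would read positions from the BP5 file via adios2;
# synthetic data stands in here so the example is self-contained.
def atom_trajectory(positions: np.ndarray, atom_id: int) -> np.ndarray:
    """positions has shape (timesteps, natoms, 3); return (timesteps, 3)."""
    return positions[:, atom_id, :]

rng = np.random.default_rng(0)
positions = rng.normal(size=(100, 50, 3))   # 100 steps, 50 atoms, xyz
traj = atom_trajectory(positions, atom_id=23)
print(traj.shape)

# Plotting would follow, e.g. one matplotlib subplot per axis,
# saved with plt.savefig("atom_23_trajectory.png").
```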

Pipe Flow Data Visualization via ParaView-MCP


3D flow visualization combining isosurface and orthogonal slice planes generated through natural language commands

Dataset: Incompact3d pipe flow simulation (data.bp5) with pressure and velocity fields.

Execution: Agents used ParaView-MCP, a service that connects AI systems to the ParaView visualization engine, to load the dataset and generate the 3D visualization.

Prompt: "Explore the pipe flow data. Generate an isosurface of pressure at pp=0.1 plus three orthogonal slice planes (X, Y, Z)."

Result: Produced a combined 3D flow visualization without manual setup, highlighting internal flow structures. The ParaView-MCP enabled researchers to create complex visualizations through natural language, eliminating the need for manual ParaView configuration, Python scripting, or format conversion. The agent automatically:

  • Loaded the BP5 simulation data
  • Generated a pressure isosurface at the specified threshold
  • Created three orthogonal cutting planes
  • Composed the final visualization showing both surface and volumetric flow features

This demonstrates how IOWarp's MCP ecosystem transforms scientific visualization from a multi-step manual process into a conversational workflow, allowing researchers to focus on interpreting results rather than configuring visualization pipelines.


System Observability & Monitoring

IOWarp provides comprehensive observability features for tracking system performance and resource utilization in real-time.


Real-time observability dashboard for monitoring IOWarp performance and resource utilization

Features include:

  • Real-time Metrics: Performance tracking across all storage tiers
  • Resource Monitoring: DRAM, NVMe, GPU memory, and PFS utilization
  • Workflow Visualization: End-to-end workflow execution and bottleneck identification
  • Reproducibility Tracking: Full provenance capture for agentic workflows

Performance Results & Benchmarks

We evaluated ContextWarp (IOWarp's agentic framework) across multiple dimensions to understand the performance, correctness, and reliability of AI-driven scientific workflows. Experiments were conducted on Chameleon Cloud using compute nodes with 40GB A100 GPUs.

Agentic Workflow Performance: EarthScope Dataset Analysis

Experiment: Automated download and visualization of seismological data from the National Data Platform using AI agents.

Setup: Claude Code with Sonnet 4 model, with and without IOWarp Context (MCPs and machine configuration).

Task: "Use the ndp mcp to find the latest dataset of the earthscope organization, find the url of the geojson and csv they contain, and download them. The geojson is metadata, the csv contains seismograph data. Plot a figure containing the data of each axis on their own subfigure."

Results:

  • With IOWarp Context: Successfully completed in 2 minutes with correct tool selection and visualization
  • Without IOWarp Context: Failed to execute (agent could not identify correct tools or access methods)
  • Manual Process: ~15 minutes (login, search, download, build visualization script)
  • Speedup: 7.5x faster than manual workflow

This demonstrates that IOWarp Context (MCPs + system prompts) is essential for enabling AI agents to interact with scientific infrastructure effectively.

Agent Configuration Performance: Jarvis IOR Deployment

We compared eight agent-model combinations deploying the IOR parallel I/O benchmark using Jarvis (IOWarp's deployment automation system) across five distinct prompts to assess performance and correctness tradeoffs.


Execution time comparison across different agent and model configurations

Execution Time Results (average across 5 prompts):

  • OpenCode + Devstral (local LLM for execution): 24.8 seconds (fastest)
  • Cursor + GPT-4o: 37.7 seconds
  • Gemini CLI + Gemini 2.5 Pro: 85.9 seconds
  • Claude Code + Sonnet 4: 109.2 seconds

Key Finding: Agents using small, self-hosted LLMs for execution (like Devstral) with large cloud models only for planning achieved competitive or superior performance compared to using large models for all operations. This validates IOWarp's split planning-execution design for efficiency, cost reduction, and data security.
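The split planning-execution pattern can be sketched as follows. Both models are stubbed and every name (`cloud_planner`, `local_executor`, the tool names) is an illustrative assumption, not IOWarp's implementation; the point is the shape of the split, not the calls themselves.

```python
# Toy sketch of the split planning-execution pattern: a large "planner"
# model is called once to produce a step list, and a small self-hosted
# "executor" model maps each step to a concrete tool call. Both models
# are stubbed; all names are illustrative assumptions.
def cloud_planner(goal: str) -> list:
    """Stub for a large cloud model: one expensive call, returns a plan."""
    return ["allocate nodes", "configure ior", "run benchmark", "collect results"]

def local_executor(step: str) -> dict:
    """Stub for a small local model: cheap per-step tool selection."""
    tool = {"allocate": "slurm_alloc", "configure": "jarvis_config",
            "run": "jarvis_run", "collect": "jarvis_report"}[step.split()[0]]
    return {"tool": tool, "step": step}

plan = cloud_planner("Deploy IOR with 8 processes")
calls = [local_executor(s) for s in plan]
print([c["tool"] for c in calls])
```

The economics follow directly from the shape: one planner invocation amortizes over many cheap executor calls, and sensitive data only ever reaches the self-hosted model.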

Tool Call Success Rates


Success rates for correctly identifying and invoking scientific tools across agent-model combinations

Robustness Across Diverse Prompts


Configuration success rates when tested with 20 different prompt styles ranging from simple to highly detailed

Robustness Test: 20 prompts with varying detail levels for IOR deployment, from simple ("Deploy ior with 8 processes using the deployment agent") to detailed bash scripts.

Configuration Success Rates (20 prompts):

  • Claude Code + Sonnet 4: 100%
  • Gemini CLI + Gemini 2.5 Pro: 100%
  • OpenCode CLI + Gemini 2.5 Pro: 100%
  • OpenCode CLI + Devstral-2hl: 100%
  • Cursor + GPT-5: Lower success rate (struggled with parameter synonyms like "nprocs" vs "number of processes")

Key Finding: IOWarp's agentic design is highly resilient to changes in prompt phrasing and information granularity. Configurations using local LLMs for execution achieved 100% success, demonstrating that the split plan-execution architecture effectively balances cost, security, and reliability.


Demonstrations

IOWarp-MCP Demo

Full workflow showcase from scheduling to deployment, data collection, and analysis using IOWarp MCPs with Claude Code.

IOWarp Reproducibility Demo

Showcase of IOWarp's reproducibility visualizer for agentic workflows, enabling full provenance tracking and replay.

IOWarp-MCP Adios Demo

Showcase of IOWarp Adios MCP providing full analysis of a BP5 file generated by LAMMPS, demonstrating natural language queries on scientific data.


Publications

  • H. Devarajan, A. Kougkas, H. Zheng, V. Vishwanath, X.-H. Sun. Stimulus: Accelerate Data Management for Scientific AI Applications in HPC. The 22nd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID'22), May 16-19, 2022. Conference, May 2022.
  • J. Cernuda, H. Devarajan, L. Logan, K. Bateman, N. Rajesh, J. Ye, A. Kougkas, X.-H. Sun. HFlow: A Dynamic and Elastic Multi-Layered Data Forwarder. The 2021 IEEE International Conference on Cluster Computing (CLUSTER'21), September 7-10, 2021. Conference, September 2021.
  • N. Rajesh, H. Devarajan, J. Cernuda, K. Bateman, L. Logan, J. Ye, A. Kougkas, X.-H. Sun. Apollo: An ML-assisted Real-Time Storage Resource Observer. The 30th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC'21), June 21-25, 2021. Conference, June 2021.
  • H. Devarajan, H. Zheng, A. Kougkas, X.-H. Sun, V. Vishwanath. DLIO: A Data-Centric Benchmark for Scientific Deep Learning Applications. The 2021 IEEE/ACM International Symposium in Cluster, Cloud, and Internet Computing (CCGrid'21), May 17-20, 2021. Best Paper Award. Conference, May 2021.
  • H. Devarajan, A. Kougkas, X.-H. Sun. HReplica: A Dynamic Data Replication Engine with Adaptive Compression for Multi-Tiered Storage. The 2020 IEEE International Conference on Big Data (Big Data'20), December 10-13, 2020. Conference, December 2020.
  • H. Devarajan, A. Kougkas, X.-H. Sun. A Dynamic Multi-Tiered Storage System for Extreme Scale Computing. The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'20). Poster, November 2020.
  • H. Devarajan, A. Kougkas, K. Bateman, X.-H. Sun. HCL: Distributing Parallel Data Structures in Extreme Scales. IEEE International Conference on Cluster Computing (CLUSTER'20), September 14-17, 2020. Conference, September 2020.
  • H. Devarajan, A. Kougkas, L. Logan, X.-H. Sun. HCompress: Hierarchical Data Compression for Multi-Tiered Storage Environments. IEEE International Parallel and Distributed Processing Symposium (IPDPS'20), May 18-22, 2020. Conference, May 2020.
  • H. Devarajan, A. Kougkas, X.-H. Sun. HFetch: Hierarchical Data Prefetching for Scientific Workflows in Multi-Tiered Storage Environments. IEEE International Parallel and Distributed Processing Symposium (IPDPS'20), May 18-22, 2020. Conference, May 2020.
  • A. Kougkas, H. Devarajan, X.-H. Sun. I/O Acceleration via Multi-Tiered Data Buffering and Prefetching. Journal of Computer Science and Technology (JCST'20), vol. 35, no. 1, pp. 92-120. Journal, January 2020.
  • H. Devarajan, A. Kougkas, X.-H. Sun. HFetch: Hierarchical Data Prefetching in Multi-Tiered Storage Environments. The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'19), Ph.D. Forum. Best Poster Nominee. Poster, November 2019.
  • A. Kougkas, H. Devarajan, J. Lofstead, X.-H. Sun. LABIOS: A Distributed Label-Based I/O System. The 28th International Symposium on High-Performance Parallel and Distributed Computing (HPDC'19), Phoenix, USA, 2019, pp. 13-24. Karsten Schwan Best Paper Award. Conference, June 2019.
  • H. Devarajan, A. Kougkas, X.-H. Sun. An Intelligent, Adaptive, and Flexible Data Compression Framework. IEEE/ACM International Symposium in Cluster, Cloud, and Grid Computing (CCGrid'19), Larnaca, Cyprus, 2019, pp. 82-91. Conference, May 2019.
  • H. Devarajan, A. Kougkas, P. Challa, X.-H. Sun. Vidya: Performing Code-Block I/O Characterization for Data Access Optimization. The IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC'18), Bengaluru, India, 2018, pp. 255-264. Conference, December 2018.
  • A. Kougkas, H. Devarajan, X.-H. Sun, J. Lofstead. Harmonia: An Interference-Aware Dynamic I/O Scheduler for Shared Non-Volatile Burst Buffers. The IEEE International Conference on Cluster Computing (Cluster'18), Belfast, UK, 2018, pp. 290-301. Conference, September 2018.
  • A. Kougkas, H. Devarajan, X.-H. Sun. Hermes: A Heterogeneous-Aware Multi-Tiered Distributed I/O Buffering System. The 27th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC'18), Tempe, AZ, USA, 2018, pp. 219-230. Conference, June 2018.


Collaborators

  • HDF Group
  • University of Utah

Deployment Partners

IOWarp is deployed and being evaluated at leading national laboratories:

  • Argonne National Laboratory - Advanced computing research and leadership computing facilities
  • Lawrence Livermore National Laboratory - High-performance computing systems and simulation science
  • NERSC (National Energy Research Scientific Computing Center) - DOE scientific computing facility

National Science Foundation

Award #2411318 (2024-2029), $5 million