I/O-Aware PIM Acceleration for Long-Sequence LLM Inference with Hybrid Sparse Attention
Authors: X. Lu, L. Hu, H. Huang, P. Jiang, X.-H. Sun
Date: May 2026
Venue: The 40th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2026)
Type: Conference
Tags: I/O Analysis, Hardware/Software Co-Design, PIM, UniMCC