Trade-off Study of Localizing Communication and Balancing Network Traffic on Dragonfly System

Authors: X. Wang, X. Yang, M. Mubarak, R. Ross, Z. Lan

Date: May 2018

Venue: The 32nd IEEE International Parallel and Distributed Processing Symposium (IPDPS'18), Vancouver, Canada, 2018, pp. 1113-1122.

Type: Conference

Abstract

Dragonfly networks are being widely adopted in high-performance computing systems. On these networks, however, interference caused by resource sharing can lead to significant network congestion and performance variability. We present a comparative analysis exploring the trade-off between localizing communication and balancing network traffic. We conduct trace-based simulations for applications with different communication patterns, using multiple job placement policies and routing mechanisms. We perform an in-depth performance analysis on representative applications individually and show that different applications have distinct preferences regarding localized communication and balanced network traffic. We further demonstrate the effect of external network interference by introducing background traffic and show that localized communication can help reduce the application performance variation caused by network sharing.