
A research team has introduced Capsule, a new out-of-core mechanism for large-scale GNN training that achieves up to a 12.02× speedup while using only 22.24% of the main memory, compared with state-of-the-art out-of-core GNN systems. The work was published in the Proceedings of the ACM on Management of Data and was carried out by the Data Darkness Lab (DDL) at the Medical Imaging Intelligence and Robotics Research Center of the University of Science and Technology of China (USTC) Suzhou Institute.
Graph neural networks (GNNs) have demonstrated strengths in areas such as recommendation systems, natural language processing, computational chemistry, and bioinformatics. Popular training frameworks for GNNs, such as DGL and PyG, leverage GPU parallel processing power to extract structural information from graph data.
Despite the computational advantages offered by GPUs in GNN training, their limited memory capacity struggles to accommodate large-scale graph data, making scalability a significant challenge for existing GNN systems. To address this issue, the DDL team proposed Capsule, a new out-of-core (OOC) GNN training framework for large-scale GNN training.
Unlike existing out-of-core GNN frameworks, Capsule uses graph partitioning and pruning strategies to ensure that each training subgraph's structure and features fit entirely into GPU memory, eliminating CPU-GPU I/O during backpropagation and boosting system performance.
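The article does not include code, but the general partition-and-cache pattern it describes can be illustrated. The sketch below is a minimal, hypothetical example in PyTorch: each partition's structure and features are copied to GPU memory once, so the forward and backward passes run without CPU-GPU traffic. The `partition_graph` helper, the `model` interface, and the tensor layouts are assumptions for illustration, not Capsule's actual API.

```python
import torch

def train_on_partitions(graph, features, labels, model, num_parts, epochs, lr=0.01):
    """Illustrative partition-and-cache training loop (not Capsule's actual API).

    `partition_graph` is a hypothetical helper that splits the graph into
    `num_parts` pieces, each small enough to fit in GPU memory together
    with its node features (e.g. via METIS-style partitioning).
    """
    device = torch.device("cuda")
    model = model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)

    parts = partition_graph(graph, num_parts)  # hypothetical: list of (edge_index, node_ids)
    for epoch in range(epochs):
        for edge_index, node_ids in parts:
            # Copy this partition's structure and features to the GPU once ...
            edge_index = edge_index.to(device)
            x = features[node_ids].to(device)
            y = labels[node_ids].to(device)

            # ... so the forward and backward passes run entirely in GPU memory,
            # with no CPU-GPU I/O during backpropagation.
            opt.zero_grad()
            out = model(x, edge_index)
            loss = torch.nn.functional.cross_entropy(out, y)
            loss.backward()
            opt.step()
```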
Additionally, Capsule further optimizes performance with a subgraph loading mechanism based on the shortest Hamiltonian cycle and a pipelined parallel execution strategy. Capsule is also plug-and-play and integrates seamlessly with mainstream open-source GNN training frameworks.
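The loading order and pipelining described above can also be sketched. As the article describes it, the idea is to visit partitions in an order that keeps consecutive partitions cheap to load, which a shortest-Hamiltonian-cycle formulation captures, and to prefetch the next partition while the current one is training. The greedy nearest-neighbor ordering, the overlap-based distance, and the thread-based prefetcher below are illustrative simplifications, not Capsule's implementation.

```python
import threading
import queue

def cycle_order(partitions):
    """Greedy nearest-neighbor approximation of a shortest Hamiltonian cycle,
    where two partitions are 'close' if they share many nodes (more shared
    nodes means less new data to load when moving to the next partition)."""
    node_sets = [set(p["node_ids"]) for p in partitions]
    order, remaining = [0], set(range(1, len(partitions)))
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda j: len(node_sets[last] & node_sets[j]))
        order.append(nxt)
        remaining.remove(nxt)
    return order

def pipelined_training(partitions, load_to_gpu, train_step):
    """Overlap loading of the next partition with training on the current one,
    using a single prefetch thread and a one-slot queue (double buffering)."""
    order = cycle_order(partitions)
    buf = queue.Queue(maxsize=1)

    def producer():
        for idx in order:
            buf.put(load_to_gpu(partitions[idx]))  # loads ahead of the consumer
        buf.put(None)  # signal end of the epoch

    threading.Thread(target=producer, daemon=True).start()
    while (batch := buf.get()) is not None:
        train_step(batch)  # trains while the producer loads the next partition
```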
In tests using large-scale real-world graph datasets, Capsule outperformed the best existing systems, achieving up to a 12.02× performance improvement while using only 22.24% of the main memory. The work also provides a theoretical upper bound on the variance of the embeddings produced during training.
This work offers a new approach to bridging the gap between the colossal graph structures being processed and the limited memory capacity of GPUs.
More information:
Yongan Xiang et al, Capsule: An Out-of-Core Training Mechanism for Colossal GNNs, Proceedings of the ACM on Management of Data (2025). DOI: 10.1145/3709669