iSpLib: A Library for Accelerating Graph Neural Networks using Auto-tuned Sparse Operations

Core computations in Graph Neural Network (GNN) training and inference are often mapped to sparse matrix operations such as sparse-dense matrix multiplication (SpMM). These sparse operations are hard to optimize by manual tuning because their performance depends significantly on the sparsity of the input graph, the GNN model, and the computing platform. To address this challenge, we present iSpLib, a PyTorch-based C++ library equipped with auto-tuned sparse operations. iSpLib expedites GNN training with a cache-enabled backpropagation that stores intermediate matrices in local caches. The library offers a user-friendly Python plug-in that lets users take advantage of our optimized PyTorch operations out of the box, with only two lines of additional code, for any existing linear algebra-based PyTorch implementation of popular GNNs (Graph Convolution Network, GraphSAGE, Graph Isomorphism Network, etc.). We demonstrate that iSpLib obtains up to 27x overall training speedup compared to the equivalent PyTorch 2.1.0 and PyTorch Geometric 2.4.0 implementations on the CPU. Our library is publicly available at https://github.com/HipGraph/iSpLib (https://doi.org/10.5281/zenodo.10806511).
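To make the two ideas in the abstract concrete, the following is a minimal pure-Python sketch, not iSpLib's implementation or API. It shows neighbor aggregation written as an SpMM over a CSR matrix, and a backward pass that reuses a cached matrix instead of rebuilding it. The names `spmm`, `transpose`, and `CachedAggregation` are hypothetical, and the choice of the transposed adjacency matrix as the cached intermediate is an assumption made for illustration; the abstract only states that intermediate matrices are cached.

```python
from typing import List, Optional, Tuple

# Hypothetical CSR layout for this sketch: (indptr, indices, data).
CSR = Tuple[List[int], List[int], List[float]]


def spmm(a: CSR, n_rows: int, x: List[List[float]]) -> List[List[float]]:
    """Sparse-dense product: CSR matrix (n_rows x n) times dense x (n x d)."""
    indptr, indices, data = a
    d = len(x[0])
    y = [[0.0] * d for _ in range(n_rows)]
    for i in range(n_rows):
        # Only the stored nonzeros of row i contribute to output row i.
        for k in range(indptr[i], indptr[i + 1]):
            j, v = indices[k], data[k]
            for c in range(d):
                y[i][c] += v * x[j][c]
    return y


def transpose(a: CSR, n_rows: int, n_cols: int) -> CSR:
    """Build the CSR representation of the transpose of a."""
    indptr, indices, data = a
    t_indptr = [0] * (n_cols + 1)
    for j in indices:
        t_indptr[j + 1] += 1  # count nonzeros per column
    for j in range(n_cols):
        t_indptr[j + 1] += t_indptr[j]  # prefix sum gives row starts
    next_slot = t_indptr[:-1]  # slicing copies, so t_indptr is not mutated
    t_indices = [0] * len(indices)
    t_data = [0.0] * len(data)
    for i in range(n_rows):
        for k in range(indptr[i], indptr[i + 1]):
            pos = next_slot[indices[k]]
            t_indices[pos], t_data[pos] = i, data[k]
            next_slot[indices[k]] += 1
    return t_indptr, t_indices, t_data


class CachedAggregation:
    """Forward: Y = A X. Backward: dL/dX = A^T dL/dY, with A^T cached."""

    def __init__(self, a: CSR, n_rows: int, n_cols: int) -> None:
        self.a, self.n_rows, self.n_cols = a, n_rows, n_cols
        self._a_t: Optional[CSR] = None  # assumed cached intermediate
        self.transposes_built = 0

    def forward(self, x: List[List[float]]) -> List[List[float]]:
        return spmm(self.a, self.n_rows, x)

    def backward(self, grad_y: List[List[float]]) -> List[List[float]]:
        if self._a_t is None:  # build once, reuse in every later epoch
            self._a_t = transpose(self.a, self.n_rows, self.n_cols)
            self.transposes_built += 1
        return spmm(self._a_t, self.n_cols, grad_y)


# A = [[0, 2, 0], [1, 0, 3]] stored in CSR form.
layer = CachedAggregation(([0, 1, 3], [1, 0, 2], [2.0, 1.0, 3.0]), 2, 3)
out = layer.forward([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
for _ in range(3):  # three "epochs" of backward share one cached A^T
    grad_x = layer.backward([[1.0, 0.0], [0.0, 1.0]])
```

In this toy setting the transpose is built once and reused across all three backward calls. Because the graph structure does not change between epochs, a cache of this kind trades a small amount of memory for repeated work.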

