Recent studies have shown that binary Graph Neural Networks (GNNs) are a promising way to reduce the computational cost of GNNs by using binarized tensors. Prior work, however, has focused mainly on algorithm design and training techniques, leaving open the question of how to fully realize this performance potential on accelerator hardware. This work redesigns the binary GNN inference backend from an efficiency perspective. It fills the gap by proposing a series of abstractions and techniques that map binary GNNs and their computations onto GPUs in a way that best fits the nature of bit manipulation. Results on real-world graphs with GCNs, GraphSAGE, and GraphSAINT show that the proposed techniques outperform state-of-the-art binary GNN implementations by 8-22X while maintaining the same accuracy. The BitGNN code is publicly available.
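To illustrate why binarized tensors suit bit manipulation, the following is a minimal sketch of the XOR-and-popcount dot product commonly used in binarized networks. It is not taken from BitGNN; the function names `pack_signs` and `binary_dot` are hypothetical, and the encoding (+1 as bit 1, -1 as bit 0) is an assumption made for this example.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into one integer, one bit per value.

    Assumed encoding for this sketch: +1 -> bit 1, -1 -> bit 0.
    """
    word = 0
    for i, v in enumerate(values):
        if v > 0:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {+1, -1} vectors given their packed bits.

    Positions where the bits agree contribute +1 and positions where they
    differ contribute -1, so the result is n - 2 * (number of differing bits).
    """
    mismatches = bin(a_bits ^ b_bits).count("1")  # popcount of the XOR
    return n - 2 * mismatches


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]

# Reference result computed with ordinary multiply-accumulate.
reference = sum(x * y for x, y in zip(a, b))

# Same result computed from the packed bit representation.
packed = binary_dot(pack_signs(a), pack_signs(b), len(a))
assert packed == reference
```

On a GPU the same pattern would operate on 32-bit or 64-bit words, with the population count provided by a hardware intrinsic such as CUDA's `__popc`, so one instruction stands in for dozens of multiply-accumulate operations.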