With the ever-growing popularity of Graph Neural Networks (GNNs), efficient GNN inference is gaining tremendous attention. Field-Programmable Gate Arrays (FPGAs) are a promising execution platform due to their fine-grained parallelism, low power consumption, reconfigurability, and concurrent execution. Even better, High-Level Synthesis (HLS) tools bridge the gap between the non-trivial effort of FPGA development and the rapid emergence of new GNN models. In this paper, we propose GNNHLS, an open-source framework to comprehensively evaluate GNN inference acceleration on FPGAs via HLS. It contains a software stack for data generation and baseline deployment, and FPGA implementations of 6 well-tuned GNN HLS kernels. We evaluate GNNHLS on 4 graph datasets with distinct topologies and scales. The results show that GNNHLS achieves up to 50.8x speedup and 423x energy reduction relative to the CPU baselines. Compared with the GPU baselines, GNNHLS achieves up to 5.16x speedup and 74.5x energy reduction.