In this paper, we propose IMA-GNN, an In-Memory Accelerator for centralized and decentralized Graph Neural Network (GNN) inference, explore its potential in both settings, and provide a guideline for the community targeting flexible and efficient edge computation. Leveraging IMA-GNN, we first model the computation and communication latencies of edge devices. We then present practical case studies on GNN-based taxi demand and supply prediction and adopt four large graph datasets to quantitatively compare and analyze the centralized and decentralized settings. Our cross-layer simulation results demonstrate that, on average, IMA-GNN in the centralized setting achieves a ~790x communication speed-up over the decentralized GNN setting. However, the decentralized setting performs computation ~1400x faster while reducing the power consumption per device. This trade-off further underlines the need for a hybrid semi-decentralized GNN approach.