Dynamic graph neural networks (DGNNs) are becoming increasingly popular because of their widespread use in capturing dynamic features of the real world. A variety of DGNNs designed from an algorithmic perspective have succeeded in incorporating temporal information into graph processing. Despite their promising algorithmic performance, deploying DGNNs on hardware presents additional challenges due to model complexity, model diversity, and the nature of the time dependency. Moreover, the differences between DGNNs and static graph neural networks make hardware optimizations designed for static graph neural networks unsuitable for DGNNs. In this paper, we select eight prevailing DGNNs with different characteristics and profile them on both CPU and GPU. We summarize and analyze the profiling results, providing in-depth insights into the bottlenecks of DGNNs on hardware and identifying potential optimization opportunities for future DGNN acceleration. Following a comprehensive survey, we provide a detailed analysis of DGNN performance bottlenecks on hardware, including temporal data dependency, workload imbalance, data movement, and GPU warm-up. We suggest several optimizations from both software and hardware perspectives. This paper is the first to provide an in-depth analysis of the hardware performance of DGNNs. Code is available at https://github.com/sharc-lab/DGNN_analysis.