Graph Convolutional Neural Networks (GCNs) are well suited to processing graph data in non-grid domains. They capture both the topological structure and the node features of a graph and integrate them into each node's final representation. GCNs have been studied extensively in fields such as recommendation systems, social networks, and protein molecular structures. As graph neural networks are applied more widely, research has focused on improving their performance while compressing their size. In this work, a plug-in module named the Graph Knowledge Enhancement and Distillation Module (GKEDM) is proposed. GKEDM enhances node representations and improves the performance of GCNs by extracting and aggregating graph information via a multi-head attention mechanism. GKEDM can also serve as an auxiliary transferor for knowledge distillation: with a specially designed attention distillation method, it distills the knowledge of large teacher models into compact, high-performance student models. Experiments on multiple datasets demonstrate that GKEDM significantly improves the performance of various GCNs with minimal overhead and efficiently transfers knowledge from large teacher networks to small student networks via attention distillation.
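The abstract does not give GKEDM's equations, so the following is only a minimal sketch of the two ideas it names: multi-head attention over each node's neighbors, and a loss that pushes a student's attention toward a teacher's. Several things here are assumptions rather than the paper's method: the function names (`neighbor_attention`, `enhance`, `attention_distillation_loss`) are hypothetical, the attention is a plain scaled dot product with no learned projections, every neighbor list is assumed to include a self-loop, and KL divergence is used as a stand-in for the paper's "specially designed" distillation objective.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def neighbor_attention(features, neighbors, num_heads):
    """Return attn[head][node]: weights over neighbors[node] (assumed non-empty).

    Each head sees one contiguous slice of the feature vector and scores
    neighbors by scaled dot product (no learned projections in this sketch).
    """
    dim = len(features[0])
    if dim % num_heads != 0:
        raise ValueError("feature size must be divisible by num_heads")
    head_dim = dim // num_heads
    scale = math.sqrt(head_dim)
    attn = []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        per_node = []
        for i, nbrs in enumerate(neighbors):
            query = features[i][lo:hi]
            scores = [
                sum(q * k for q, k in zip(query, features[j][lo:hi])) / scale
                for j in nbrs
            ]
            per_node.append(softmax(scores))
        attn.append(per_node)
    return attn


def enhance(features, neighbors, attn):
    """Attention-weighted sum of neighbor features per head, heads concatenated."""
    num_heads = len(attn)
    head_dim = len(features[0]) // num_heads
    enhanced = []
    for i, nbrs in enumerate(neighbors):
        row = []
        for h in range(num_heads):
            lo = h * head_dim
            for d in range(lo, lo + head_dim):
                row.append(sum(w * features[j][d] for w, j in zip(attn[h][i], nbrs)))
        enhanced.append(row)
    return enhanced


def attention_distillation_loss(teacher_attn, student_attn, eps=1e-12):
    """Mean KL(teacher || student) over all heads and nodes (assumed objective)."""
    total, count = 0.0, 0
    for t_head, s_head in zip(teacher_attn, student_attn):
        for t_row, s_row in zip(t_head, s_head):
            total += sum(
                t * math.log((t + eps) / (s + eps)) for t, s in zip(t_row, s_row)
            )
            count += 1
    return total / count


# Toy graph: 3 nodes, each neighbor list includes the node itself (self-loop).
neighbors = [[0, 1, 2], [0, 1], [0, 2]]
teacher_feats = [[1.0, 0.0, 0.5, 2.0], [0.0, 1.0, 1.5, 0.5], [1.0, 1.0, 0.0, 1.0]]
student_feats = [[0.5, 1.0], [1.0, 0.0], [0.2, 0.3]]  # smaller hidden size

teacher_attn = neighbor_attention(teacher_feats, neighbors, num_heads=2)
student_attn = neighbor_attention(student_feats, neighbors, num_heads=2)
loss = attention_distillation_loss(teacher_attn, student_attn)
print("attention distillation loss:", loss)
```

One reason attention is a convenient quantity to distill, at least in this sketch, is that its shape depends only on the graph and the number of heads, not on the hidden size. The teacher above uses 4-dimensional features and the student 2-dimensional ones, yet their attention maps can be compared directly without an extra projection layer.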