MuGSI: Distilling GNNs with Multi-Granularity Structural Information for Graph Classification

Recent work has introduced GNN-to-MLP knowledge distillation (KD) frameworks that combine the superior performance of GNNs with the fast inference of MLPs. However, existing KD frameworks are designed primarily for node classification within a single graph, leaving their applicability to graph classification largely unexplored. Two main challenges arise when extending node-classification KD to graph classification: (1) learning signals are inherently sparse, because soft labels are generated only at the graph level; (2) student MLPs have limited expressiveness, especially on datasets with limited input feature spaces. To overcome these challenges, we introduce MuGSI, a novel KD framework that employs Multi-granularity Structural Information for graph classification. To tackle the first challenge, we propose a multi-granularity distillation loss composed of three distinct components: graph-level distillation, subgraph-level distillation, and node-level distillation. Each component targets a specific granularity of the graph structure, ensuring a comprehensive transfer of structural knowledge from the teacher model to the student model. To tackle the second challenge, MuGSI incorporates a node feature augmentation component that enhances the expressiveness of the student MLPs and makes them more capable learners. We perform extensive experiments across a variety of datasets and teacher/student model architectures. The experimental results demonstrate the effectiveness, efficiency, and robustness of MuGSI. Code is publicly available at \textbf{\url{https://github.com/tianyao-aka/MuGSI}}.

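The abstract names the three components of the multi-granularity distillation loss but does not give their functional form. The following is a minimal sketch of how such a loss could be assembled, not the authors' implementation. It assumes a temperature-softened KL divergence for the graph-level term and mean squared error between teacher and student embeddings for the subgraph-level and node-level terms. The function names, the dictionary keys (`logits`, `subgraph_emb`, `node_emb`), the weights, and the temperature are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mse(a, b):
    """Mean squared error between two equally shaped lists of vectors."""
    total, count = 0.0, 0
    for vec_a, vec_b in zip(a, b):
        for x, y in zip(vec_a, vec_b):
            total += (x - y) ** 2
            count += 1
    return total / count


def multi_granularity_loss(teacher, student, weights=(1.0, 1.0, 1.0), temperature=2.0):
    """Weighted sum of graph-, subgraph-, and node-level distillation terms.

    ASSUMPTION: KL for the graph level and MSE for the other two levels are
    illustrative choices; the abstract does not specify them.
    """
    w_graph, w_sub, w_node = weights
    graph_term = kl_divergence(
        softmax(teacher["logits"], temperature),
        softmax(student["logits"], temperature),
    )
    sub_term = mse(teacher["subgraph_emb"], student["subgraph_emb"])
    node_term = mse(teacher["node_emb"], student["node_emb"])
    return w_graph * graph_term + w_sub * sub_term + w_node * node_term


# Toy outputs for one graph with two nodes and one subgraph (made-up values).
teacher_out = {
    "logits": [2.0, 0.0],
    "subgraph_emb": [[1.0, 0.0]],
    "node_emb": [[1.0, 1.0], [0.0, 0.0]],
}
student_out = {
    "logits": [1.0, 0.5],
    "subgraph_emb": [[0.5, 0.0]],
    "node_emb": [[0.0, 1.0], [0.0, 0.0]],
}
loss = multi_granularity_loss(teacher_out, student_out)
```

Under these assumptions, the subgraph-level and node-level terms supply additional supervision for every graph, which is one way to read the abstract's claim that the multi-granularity loss addresses the sparsity of graph-level soft labels.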