Due to the significant computational challenge of training large-scale graph neural networks (GNNs), various sparse learning techniques have been exploited to reduce memory and storage costs. Examples include \textit{graph sparsification}, which samples a subgraph to reduce the amount of data aggregation, and \textit{model sparsification}, which prunes the neural network to reduce the number of trainable weights. Despite the empirical successes in reducing the training cost while maintaining the test accuracy, the theoretical generalization analysis of sparse learning for GNNs remains elusive. To the best of our knowledge, this paper provides the first theoretical characterization of joint edge-model sparse learning from the perspective of sample complexity and convergence rate in achieving zero generalization error. It proves analytically that both sampling important nodes and pruning the lowest-magnitude neurons can reduce the sample complexity and improve convergence without compromising the test accuracy. Although the analysis is centered on two-layer GNNs with structural constraints on data, the insights are applicable to more general setups and are justified on both synthetic and practical citation datasets.