The Lottery Ticket Hypothesis (LTH) claims the existence of a winning ticket (i.e., a properly pruned sub-network together with its original weight initialization) that can achieve performance competitive with the original dense network. A recent work, UGS, extended LTH to prune graph neural networks (GNNs) in order to accelerate GNN inference. UGS simultaneously prunes the graph adjacency matrix and the model weights using the same masking mechanism. However, because the adjacency matrix and the weight matrices play very different roles, we find that sparsifying them leads to different performance characteristics. Specifically, the performance of a sparsified GNN degrades significantly once the graph sparsity exceeds a certain level. We therefore propose two techniques to improve GNN performance when the graph sparsity is high. First, UGS prunes the adjacency matrix using a loss formulation that does not properly involve all elements of the adjacency matrix; in contrast, we add a new auxiliary loss head that better guides edge pruning by involving the entire adjacency matrix. Second, by regarding unfavorable graph sparsification as adversarial data perturbations, we formulate the pruning process as a min-max optimization problem to improve the robustness of lottery tickets when the graph sparsity is high. We further investigate the question: can the "retrainable" winning ticket of a GNN also be effective for graph transfer learning? We call this the transferable graph lottery ticket (GLT) hypothesis. Extensive experiments demonstrate the superiority of our proposed sparsification method over UGS and empirically verify our transferable GLT hypothesis.
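The abstract does not spell out how a single masking mechanism is applied to both the adjacency matrix and the model weights. As a rough illustration only, the sketch below assumes magnitude-based pruning of mask scores, in the spirit of iterative magnitude pruning. The helper `prune_lowest`, the score arrays, and the pruning fractions are all hypothetical names and values chosen for this example; they are not taken from UGS or from the method proposed here.

```python
import numpy as np


def prune_lowest(mask, scores, fraction):
    """Return a copy of `mask` with the lowest-|score| active entries zeroed.

    Assumption for illustration: entries are ranked by the absolute value
    of a score array of the same shape, and `fraction` of the entries that
    are still active (non-zero in `mask`) are removed in each call.
    """
    active = np.flatnonzero(mask)            # flat indices of surviving entries
    n_prune = int(fraction * active.size)    # how many entries to drop this round
    new_mask = mask.copy().ravel()
    if n_prune > 0:
        order = np.argsort(np.abs(scores.ravel()[active]), kind="stable")
        new_mask[active[order[:n_prune]]] = 0
    return new_mask.reshape(mask.shape)


# Toy graph with 8 non-zero (directed) adjacency entries.
adjacency = np.array(
    [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
edge_mask = (adjacency != 0).astype(float)   # only existing edges can be pruned

rng = np.random.default_rng(0)
edge_scores = rng.normal(size=adjacency.shape)   # stand-in for learned edge-mask values
weight_scores = rng.normal(size=(4, 2))          # stand-in for learned weight-mask values
weight_mask = np.ones_like(weight_scores)

# The same routine is applied to both objects, with separate sparsity levels.
edge_mask = prune_lowest(edge_mask, edge_scores, fraction=0.25)
weight_mask = prune_lowest(weight_mask, weight_scores, fraction=0.5)

# Note: entries are pruned individually, so the result may be asymmetric.
sparse_adjacency = adjacency * edge_mask
print(int(edge_mask.sum()), int(weight_mask.sum()))
```

Applying one routine to both masks makes the abstract's concern concrete: a removed weight entry changes a model parameter, whereas a removed adjacency entry deletes an edge from the input graph, so equal sparsity levels need not affect accuracy equally.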