Inappropriate sample selection and limited training data often cause a distribution shift between the training and test sets, which can degrade the test performance of Graph Neural Networks (GNNs). Existing approaches mitigate this issue either by making GNNs more robust to distribution shift or by reducing the shift itself. However, both approaches require retraining the model, which becomes infeasible when the model structure and parameters are inaccessible. To address this challenge, we propose FRGNN, a general framework that performs feature reconstruction for GNNs. FRGNN constructs a mapping from the output of a well-trained GNN back to its input to obtain class representative embeddings, and then uses these embeddings to reconstruct the features of labeled nodes. The reconstructed features are incorporated into the message passing mechanism of GNNs to influence the predictions for unlabeled nodes at test time. Notably, the reconstructed node features can be fed directly to the well-trained model at test time, reducing the distribution shift and improving test performance without any modification to the model structure or parameters. We provide theoretical guarantees for the effectiveness of our framework and conduct comprehensive experiments on various public datasets. The experimental results show that FRGNN outperforms multiple categories of baseline methods.
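The test-time pipeline described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the frozen model is a toy one-layer GCN with fixed weights, the output-to-input mapping is a plain linear least-squares fit (the abstract does not specify the form of the mapping), and the names `normalize_adjacency`, `make_frozen_gnn`, `class_representatives`, and `reconstruct_features` are hypothetical.

```python
import numpy as np


def softmax(z):
    # Row-wise softmax with max-subtraction for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def normalize_adjacency(A):
    # Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}.
    A_tilde = A.copy()
    np.fill_diagonal(A_tilde, 0.0)
    A_tilde = A_tilde + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]


def make_frozen_gnn(W, A_hat):
    # Toy stand-in for a well-trained GNN. The returned function is used
    # as a black box: only its inputs and outputs are observed.
    def predict(X):
        return softmax(A_hat @ X @ W)
    return predict


def class_representatives(predict, X):
    # Assumption: fit a linear map M with P @ M ~= X by least squares,
    # where P holds the model's class probabilities for each node.
    P = predict(X)                              # shape (n, C)
    M, *_ = np.linalg.lstsq(P, X, rcond=None)   # shape (C, d)
    # Mapping the one-hot vector of class c through M selects row c,
    # so the rows of M serve as class representative embeddings.
    return M


def reconstruct_features(X, labeled_idx, labels, reps):
    # Replace each labeled node's features with its class representative.
    # Unlabeled nodes and the original array are left untouched.
    X_new = X.copy()
    X_new[labeled_idx] = reps[labels]
    return X_new


# Small synthetic example (random data, for shape checking only).
rng = np.random.default_rng(0)
n, d, C = 12, 5, 3
X = rng.normal(size=(n, d))
A = (rng.random((n, n)) < 0.3).astype(float)
A = np.maximum(A, A.T)                          # make the graph undirected
predict = make_frozen_gnn(rng.normal(size=(d, C)), normalize_adjacency(A))

labeled_idx = np.array([0, 3, 7])
labels = np.array([2, 0, 1])
reps = class_representatives(predict, X)
X_rec = reconstruct_features(X, labeled_idx, labels, reps)
probs = predict(X_rec)                          # same model, new inputs
```

The model is called twice and never modified: once to fit the mapping, and once on the reconstructed features, whose effect reaches unlabeled nodes only through the aggregation step `A_hat @ X`. On random data like this the sketch demonstrates the data flow, not any accuracy gain.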