Learning on graphs, where instance nodes are interconnected, has become one of the central problems in deep learning: relational structures are pervasive and induce data interdependence, which hinders straightforward adaptation of existing approaches that assume i.i.d.~sampled inputs. However, current models mostly focus on improving test performance on in-distribution data and largely ignore the risk posed by out-of-distribution (OOD) test samples, which may lead to negative outcomes if the model's predictions on them are overconfident. In this paper, we investigate the under-explored problem of OOD detection on graph-structured data and identify a provably effective OOD discriminator based on an energy function directly extracted from graph neural networks (GNNs) trained with a standard classification loss. This paves the way for a simple, powerful, and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe. The model also has theoretical properties that guarantee an overall distinguishable margin between the detection scores of in-distribution and OOD samples; more critically, this margin can be further strengthened by a learning-free energy belief propagation scheme. For comprehensive evaluation, we introduce new benchmark settings that evaluate the model's ability to detect OOD data arising from both synthetic and real distribution shifts (cross-domain graph shifts and temporal graph shifts). The results show that GNNSafe achieves up to $17.0\%$ AUROC improvement over state-of-the-art methods and could serve as a simple yet strong baseline in this under-developed area.
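To make the two ingredients named above concrete, the following is a minimal sketch rather than the paper's implementation. The abstract gives no formulas, so the sketch assumes the common energy-based form in which a node's score is the negative log-sum-exp of its class logits (higher energy suggesting OOD), and it assumes propagation is a convex mix of a node's own energy with the mean energy of its neighbors. The names `energy_scores`, `propagate_energy`, `alpha`, and `num_steps`, as well as the toy graph, are hypothetical.

```python
import math


def energy_scores(logits):
    """Assumed energy form: negative log-sum-exp of each node's class logits."""
    scores = []
    for row in logits:
        m = max(row)  # subtract the max for numerical stability
        scores.append(-(m + math.log(sum(math.exp(z - m) for z in row))))
    return scores


def propagate_energy(energy, neighbors, alpha=0.5, num_steps=2):
    """Learning-free smoothing (assumed form): mix own energy with neighbor mean."""
    current = list(energy)
    for _ in range(num_steps):
        updated = []
        for i, e in enumerate(current):
            nbrs = neighbors.get(i, [])
            if nbrs:
                mean_nbr = sum(current[j] for j in nbrs) / len(nbrs)
                updated.append(alpha * e + (1 - alpha) * mean_nbr)
            else:
                updated.append(e)  # an isolated node keeps its own energy
        current = updated
    return current


# Toy data: nodes 0-2 have confident logits, node 3 has nearly flat logits.
logits = [[6.0, 0.0, 0.0], [5.0, 0.5, 0.0], [0.0, 7.0, 0.0], [0.1, 0.0, 0.1]]
neighbors = {0: [1, 2], 1: [0], 2: [0], 3: []}  # adjacency lists (hypothetical)

raw = energy_scores(logits)
smoothed = propagate_energy(raw, neighbors)
```

Under these assumptions, the flat-logit node receives the highest energy, and the propagation step involves no trainable parameters, which is what "learning-free" is taken to mean here. In practice the logits would come from a trained GNN classifier, and a threshold on the score would separate in-distribution from OOD nodes.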