Graph Neural Networks (GNNs) have emerged as powerful tools for prediction on graph-structured data. Despite their efficacy, GNNs have limited ability to provide robust uncertainty estimates, which undermines their reliability in settings where errors carry significant consequences. Moreover, GNNs typically excel in in-distribution settings, which assume that training and test data follow the same distribution, a condition often unmet in real-world graph data. In this article, we leverage conformal prediction, a widely used statistical technique that quantifies uncertainty by transforming a predictive model's outputs into prediction sets, to quantify uncertainty in GNN predictions under conditional shift\footnote{The change in the conditional probability distribution $P(\mathrm{label} \mid \mathrm{input})$ from the source domain to the target domain.} in graph-based semi-supervised learning (SSL). Additionally, we propose a novel loss function that refines model predictions by minimizing conditional shift in the latent stages. Our approach, termed Conditional Shift Robust (CondSR) conformal prediction for GNNs, is model-agnostic and adaptable to various classification models. We validate the effectiveness of our method on standard graph benchmark datasets, integrating it with state-of-the-art GNNs in node classification tasks. The code implementation is publicly available for further exploration and experimentation.