Accurate uncertainty quantification in graph neural networks (GNNs) is essential, especially in the high-stakes domains where GNNs are frequently employed. Conformal prediction (CP) offers a promising framework for quantifying uncertainty by providing $\textit{valid}$ prediction sets for any black-box model. CP gives a formal probabilistic guarantee that a prediction set contains the true label with a desired probability. However, the size of the prediction sets, known as $\textit{inefficiency}$, is influenced by the underlying model and the data-generating process. Bayesian learning, by contrast, provides a credible region based on the estimated posterior distribution, but this region is $\textit{well-calibrated}$ only when the model is correctly specified. Building on recent work that introduced a scaling parameter for constructing valid credible regions from posterior estimates, our study explores the advantages of incorporating a temperature parameter into Bayesian GNNs within the CP framework. We empirically demonstrate the existence of temperatures that result in more efficient prediction sets. Furthermore, we analyze the factors that contribute to inefficiency and offer insights into the relationship between CP performance and model calibration.
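The interaction between a temperature parameter and prediction-set size can be illustrated with a minimal split conformal prediction sketch. Everything here is an assumption made for illustration: the function names `softmax` and `conformal_sets` are hypothetical, the nonconformity score (one minus the probability assigned to the true label) is a common choice that the abstract does not confirm, and the logits are synthetic rather than the output of a Bayesian GNN.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis of a 2-D array."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def conformal_sets(cal_logits, cal_labels, test_logits, alpha=0.1, temperature=1.0):
    """Split CP with score 1 - p(true label); returns a boolean (n_test, n_classes) mask.

    The score function is an illustrative assumption, not necessarily the paper's.
    """
    n = len(cal_labels)
    cal_probs = softmax(cal_logits, temperature)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    # If the rank exceeds n, the guarantee requires including every label.
    qhat = np.inf if k > n else np.sort(scores)[k - 1]
    test_scores = 1.0 - softmax(test_logits, temperature)
    return test_scores <= qhat

# Synthetic demo (assumed data): the true class gets a logit boost of 2.
rng = np.random.default_rng(0)
n_nodes, n_classes = 400, 4
labels = rng.integers(0, n_classes, size=n_nodes)
logits = rng.normal(size=(n_nodes, n_classes))
logits[np.arange(n_nodes), labels] += 2.0
cal, test = slice(0, 200), slice(200, 400)

for T in (0.5, 1.0, 2.0):
    sets = conformal_sets(logits[cal], labels[cal], logits[test], alpha=0.1, temperature=T)
    coverage = sets[np.arange(200), labels[test]].mean()
    print(f"T={T}: mean set size={sets.sum(axis=1).mean():.2f}, coverage={coverage:.2f}")
```

The coverage guarantee holds for any fixed temperature because the same temperature is applied to calibration and test nodes; only the average set size, i.e. the inefficiency, depends on it. With this particular score and a single deterministic set of logits, the effect of the temperature may be small; the setting studied in the abstract instead applies the temperature inside a Bayesian GNN, which this sketch does not model.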