In this work, we investigate out-of-distribution (OOD) generalization for unsupervised learning methods on graph data. This setting is particularly challenging because graph neural networks (GNNs) have been shown to be sensitive to distribution shifts even when labels are available. To address this challenge, we propose a \underline{M}odel-\underline{A}gnostic \underline{R}ecipe for \underline{I}mproving \underline{O}OD generalizability of unsupervised graph contrastive learning methods, which we refer to as MARIO. MARIO introduces two principles for building graph contrastive methods that are robust to distribution shift and that overcome the limitations of existing frameworks: (i) the Information Bottleneck (IB) principle, for achieving generalizable representations, and (ii) the Invariant principle, which incorporates adversarial data augmentation to obtain invariant representations. To the best of our knowledge, this is the first work to investigate the OOD generalization problem of graph contrastive learning, with a specific focus on node-level tasks. Through extensive experiments, we demonstrate that our method achieves state-of-the-art performance on the OOD test set while maintaining performance on the in-distribution test set comparable to that of existing approaches. The source code for our method is available at https://github.com/ZhuYun97/MARIO
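To make the two ingredients named above more concrete, the following is a minimal NumPy sketch, not MARIO's actual implementation (which is in the linked repository). It assumes an InfoNCE-style contrastive objective and replaces gradient-based adversarial augmentation with a crude random search over bounded feature perturbations. The function names `info_nce` and `worst_case_view`, the `encode` callable, and all hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def info_nce(z1, z2, temperature=0.5):
    """InfoNCE loss between two views.

    Row i of z1 and row i of z2 form a positive pair; every other row of z2
    serves as a negative for row i of z1.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                      # (n, n) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


def worst_case_view(z_anchor, x, encode, epsilon=0.1, n_candidates=8, seed=0):
    """Random-search stand-in for adversarial augmentation (an assumption).

    Among a few random perturbations of the features x bounded by epsilon,
    keep the one that yields the highest contrastive loss against z_anchor.
    """
    rng = np.random.default_rng(seed)
    best_x, best_loss = x, info_nce(z_anchor, encode(x))
    for _ in range(n_candidates):
        delta = rng.uniform(-epsilon, epsilon, size=x.shape)
        loss = info_nce(z_anchor, encode(x + delta))
        if loss > best_loss:
            best_x, best_loss = x + delta, loss
    return best_x, best_loss


# Toy usage: a fixed linear map stands in for a GNN encoder (hypothetical).
rng = np.random.default_rng(1)
x = rng.normal(size=(6, 4))        # 6 nodes, 4 features each
w = rng.normal(size=(4, 3))


def encode(features):
    return features @ w


z_anchor = encode(x)
clean_loss = info_nce(z_anchor, encode(x))
x_adv, adv_loss = worst_case_view(z_anchor, x, encode)
```

In this sketch, training on `x_adv` rather than `x` would push the encoder toward representations that stay stable under the hardest perturbation found, which is the intuition behind using adversarial augmentation to obtain invariant representations. A practical implementation would typically compute the perturbation with gradients rather than random search.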