In this work, we investigate the problem of out-of-distribution (OOD) generalization for unsupervised learning methods on graph data. This scenario is particularly challenging because graph neural networks (GNNs) have been shown to be sensitive to distribution shifts, even when labels are available. To address this challenge, we propose a \underline{M}odel-\underline{A}gnostic \underline{R}ecipe for \underline{I}mproving \underline{O}OD generalizability of unsupervised graph contrastive learning methods, which we refer to as MARIO. MARIO introduces two principles for building graph contrastive methods that are robust to distribution shift and that overcome the limitations of existing frameworks: (i) the Information Bottleneck (IB) principle, for achieving generalizable representations, and (ii) the Invariant principle, which incorporates adversarial data augmentation to obtain invariant representations. To the best of our knowledge, this is the first work to investigate the OOD generalization problem of graph contrastive learning, with a specific focus on node-level tasks. Through extensive experiments, we demonstrate that our method achieves state-of-the-art performance on the OOD test set while maintaining performance comparable to existing approaches on the in-distribution test set. The source code for our method is available at: https://github.com/ZhuYun97/MARIO