There is increasing evidence that neural networks are sensitive to distribution shifts, which has brought research on out-of-distribution (OOD) generalization into the spotlight. Nonetheless, current efforts mostly focus on Euclidean data, and the formulation of OOD generalization for graph-structured data remains unclear and under-explored, owing to two fundamental challenges: 1) the inter-connection among nodes in a graph, which induces non-IID generation of data points even within the same environment, and 2) the structural information in the input graph, which is itself informative for prediction. In this paper, we formulate the OOD problem on graphs and develop a new invariant learning approach, Explore-to-Extrapolate Risk Minimization (EERM), that enables graph neural networks to leverage invariance principles for prediction. EERM resorts to multiple context explorers (instantiated as graph structure editors in our case) that are adversarially trained to maximize the variance of risks across multiple virtual environments. Such a design enables the model to extrapolate from a single observed environment, which is the common case for node-level prediction. We prove the validity of our method by theoretically showing that it guarantees a valid OOD solution, and we further demonstrate its power on various real-world datasets for handling distribution shifts arising from artificial spurious features, cross-domain transfers, and dynamic graph evolution.
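The abstract only names the ingredients of EERM; as a minimal sketch of the adversarial training it describes (the notation below is illustrative rather than the paper's exact formulation: $K$ structure editors $g_{w_1},\dots,g_{w_K}$, GNN parameters $\theta$, per-environment risk $R_k(\theta)$, and a trade-off weight $\beta$), the objective can be read as a bi-level, min-max problem in which the editors maximize, and the GNN minimizes, the variance of risks across the generated environments:

\[
\min_{\theta}\ \mathrm{Var}\big(\{R_k(\theta)\}_{k=1}^{K}\big) \;+\; \frac{\beta}{K}\sum_{k=1}^{K} R_k(\theta),
\qquad
\text{s.t.}\ \{w_k\}_{k=1}^{K} \in \arg\max_{\{w_k\}}\ \mathrm{Var}\big(\{R_k(\theta)\}_{k=1}^{K}\big),
\]

where $R_k(\theta)$ denotes the empirical risk of the GNN on the $k$-th virtual environment, obtained by letting editor $g_{w_k}$ modify the structure of the single observed graph. The variance term encourages predictions whose risk is stable across the explored environments (the invariance principle), while the mean term keeps the overall risk low.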