Out-of-distribution (OOD) generalization is indispensable for learning models in the wild, where the test distribution is typically unknown and differs from the training distribution. Recent methods derived from causality have shown great potential in achieving OOD generalization. However, existing methods mainly focus on the invariance property of causes, while largely overlooking the \textit{sufficiency} and \textit{necessity} conditions. Namely, a necessary but insufficient cause (feature) is invariant to distribution shift, yet it may not achieve the required accuracy. By contrast, a sufficient yet unnecessary cause (feature) tends to fit specific data well but may adapt poorly to a new domain. To capture the information of sufficient and necessary causes, we employ a classical concept, the probability of necessary and sufficient causes (PNS), which indicates the probability that a feature is both a necessary and a sufficient cause. To associate PNS with OOD generalization, we propose the PNS risk and formulate an algorithm to learn representations with a high PNS value. We theoretically analyze and prove the generalizability of the PNS risk. Experiments on both synthetic and real-world benchmarks demonstrate the effectiveness of the proposed method. Implementation details can be found in the GitHub repository: https://github.com/ymy4323460/CaSN.
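For intuition about what PNS measures, the following is a minimal sketch. It is not the PNS risk proposed here, nor the CaSN implementation. It assumes a binary cause X and a binary outcome Y, and uses the classical definition PNS = P(Y_x = y, Y_{x'} = y'). Under the additional assumption of exogeneity, PNS is bounded by max(0, P(y|x) - P(y|x')) <= PNS <= min(P(y|x), P(y'|x')); if monotonicity is also assumed, PNS equals P(y|x) - P(y|x'). The function names and the toy probabilities are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def pns_bounds(p_y_given_x: float, p_y_given_not_x: float) -> Tuple[float, float]:
    """Lower and upper bounds on PNS for binary X and Y.

    Assumption (not from the source): X is exogenous, so the interventional
    probabilities equal the conditional ones, P(y_x) = P(y | x).
    """
    for p in (p_y_given_x, p_y_given_not_x):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    lower = max(0.0, p_y_given_x - p_y_given_not_x)
    # P(y' | x') = 1 - P(y | x')
    upper = min(p_y_given_x, 1.0 - p_y_given_not_x)
    return lower, upper


def pns_monotone(p_y_given_x: float, p_y_given_not_x: float) -> float:
    """Point value of PNS, assuming exogeneity AND monotonicity.

    Under both assumptions, PNS = P(y | x) - P(y | x'), which coincides
    with the lower bound returned by pns_bounds.
    """
    return pns_bounds(p_y_given_x, p_y_given_not_x)[0]


# Hypothetical toy features (illustrative numbers only).
# Necessary but insufficient: Y never occurs without the feature,
# yet the feature alone yields Y only 60% of the time.
necessary_only = pns_monotone(0.6, 0.0)
# Sufficient but unnecessary: the feature always yields Y,
# yet Y also occurs 70% of the time without it.
sufficient_only = pns_monotone(1.0, 0.7)
# Both necessary and sufficient.
both = pns_monotone(1.0, 0.0)
```

In this toy setting, only the feature that is both necessary and sufficient reaches a PNS of 1; the other two score lower, which is the gap that a representation with high PNS value is meant to close.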
Title: Invariant Learning via Probability of Sufficient and Necessary Causes