Machine learning methods can be unreliable when deployed in domains that differ from the domains on which they were trained. There is a wide range of proposals for mitigating this problem by learning representations that are ``invariant'' in some sense. However, these methods generally contradict each other, and none of them consistently improves performance on real-world domain shift benchmarks. There are two main questions that must be addressed to understand when, if ever, we should use each method. First, how does each ad hoc notion of ``invariance'' relate to the structure of real-world problems? And, second, when does learning invariant representations actually yield robust models? To address these issues, we introduce a broad formal notion of what it means for a real-world domain shift to admit invariant structure. Then, we characterize the causal structures that are compatible with this notion of invariance. With this in hand, we find conditions under which method-specific invariance notions correspond to real-world invariant structure, and we clarify the relationship between invariant structure and robustness to domain shifts. For both questions, we find that the true underlying causal structure of the data plays a critical role.