Learning robust models under distribution shifts between training and test datasets is a fundamental challenge in machine learning. While learning invariant features across environments is a popular approach, it often assumes that these features are fully observed in both the training and test sets, a condition frequently violated in practice. When models rely on invariant features that are absent from the test set, their robustness in new environments can deteriorate. To tackle this problem, we introduce a novel learning principle, the Sufficient Invariant Learning (SIL) framework, which focuses on learning a sufficient subset of invariant features rather than relying on a single one. After demonstrating the limitations of existing invariant learning methods, we propose a new algorithm, Adaptive Sharpness-aware Group Distributionally Robust Optimization (ASGDRO), which learns diverse invariant features by seeking common flat minima across environments. We theoretically show that finding common flat minima enables robust predictions based on diverse invariant features. Empirical evaluations on multiple datasets, including our new benchmark, confirm ASGDRO's robustness to distribution shifts and highlight the limitations of existing methods.
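The abstract only names the ingredients, so the following is a minimal sketch of what one ASGDRO-style update could look like, assuming it pairs the exponentiated-gradient group-weight update of standard group DRO with an adaptive sharpness-aware (ASAM-style) two-pass gradient step. All names and hyperparameters here (`asgdro_step`, `rho`, `eta_q`) are illustrative assumptions, not the paper's actual interface.

```python
import torch

@torch.no_grad()
def _adaptive_perturb(params, rho):
    # ASAM-style ascent direction: eps = rho * |w|^2 * g / || |w| * g ||,
    # which makes the perturbation invariant to per-parameter rescaling.
    grads = [(p, p.abs() * p.grad) for p in params if p.grad is not None]
    norm = torch.norm(torch.stack([s.norm() for _, s in grads])) + 1e-12
    eps = []
    for p, s in grads:
        e = rho * p.abs() * s / norm
        p.add_(e)  # climb toward the locally sharpest point
        eps.append((p, e))
    return eps

def asgdro_step(model, loss_fn, group_batches, q, opt, rho=0.5, eta_q=0.01):
    # group_batches: list of (x, y) pairs, one batch per training environment.
    # q: tensor of group weights (initialized uniform), updated in place.
    params = [p for p in model.parameters() if p.requires_grad]

    # 1) Per-group losses, then the exponentiated-gradient group-weight
    #    update used by standard group DRO.
    opt.zero_grad()
    losses = torch.stack([loss_fn(model(x), y) for x, y in group_batches])
    with torch.no_grad():
        q.mul_(torch.exp(eta_q * losses.detach()))
        q.div_(q.sum())
    loss = (q * losses).sum()

    # 2) First backward pass at the current weights, then perturb.
    loss.backward()
    eps = _adaptive_perturb(params, rho)

    # 3) Second backward pass at the perturbed weights; undo the
    #    perturbation and descend with the sharpness-aware gradient.
    opt.zero_grad()
    losses_p = torch.stack([loss_fn(model(x), y) for x, y in group_batches])
    (q * losses_p).sum().backward()
    with torch.no_grad():
        for p, e in eps:
            p.sub_(e)  # return to the unperturbed weights
    opt.step()
    return loss.item()
```

In this sketch, the group weighting steers the objective toward the worst-performing environment while the two-pass perturbed gradient penalizes sharp directions, so a low loss is only attainable at minima that are simultaneously flat for all groups, matching the abstract's notion of common flat minima.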