Learning models that are robust to distribution shifts is a key concern for their real-world applicability. Invariant Risk Minimization (IRM) is a popular framework that aims to learn robust models from multiple environments. The success of IRM rests on an important assumption: the underlying causal mechanisms/features remain invariant across environments. We show that when this assumption is not satisfied, IRM can over-constrain the predictor, and to remedy this we propose a relaxation via $\textit{partial invariance}$. In this work, we theoretically highlight the sub-optimality of IRM and then demonstrate how learning from a partition of the training domains can help improve invariant models. Several experiments, conducted in linear settings as well as with deep neural networks on tasks over both language and image data, allow us to verify our conclusions.