Empirical risk minimization can generalize poorly to unseen environments if the learned model does not capture invariant feature representations. Invariant risk minimization (IRM) is a recent proposal for discovering environment-invariant representations. IRM was introduced by Arjovsky et al. (2019) and extended by Ahuja et al. (2020). IRM assumes that all environments are available to the learning system at the same time. In this work, we generalize the concept of IRM to scenarios where environments are observed sequentially. We show that existing approaches, including those designed for continual learning, fail to identify the invariant features and models across sequentially presented environments. We extend IRM under a variational Bayesian and bilevel framework, creating a general approach to continual invariant risk minimization. We also describe a strategy to solve the optimization problems using a variant of the alternating direction method of multipliers (ADMM). We show empirically, using multiple datasets and multiple sequential environments, that the proposed methods outperform or are competitive with prior approaches.