Continual Invariant Risk Minimization

Empirical risk minimization can generalize poorly to unseen environments if the learned model does not capture invariant feature representations. Invariant risk minimization (IRM) is a recent proposal for discovering environment-invariant representations. IRM was introduced by Arjovsky et al. (2019) and extended by Ahuja et al. (2020). IRM assumes that all environments are available to the learning system at the same time. In this work, we generalize the concept of IRM to scenarios where environments are observed sequentially. We show that existing approaches, including those designed for continual learning, fail to identify the invariant features and models across sequentially presented environments. We extend IRM under a variational Bayesian and bilevel framework, creating a general approach to continual invariant risk minimization. We also describe a strategy for solving the resulting optimization problems using a variant of the alternating direction method of multipliers (ADMM). We show empirically, using multiple datasets and multiple sequential environments, that the proposed methods outperform or are competitive with prior approaches.
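
For readers unfamiliar with the batch setting that this work generalizes, the following is a minimal sketch of an IRMv1-style objective in the spirit of Arjovsky et al. (2019), assuming a linear predictor and squared-error loss. It is not the continual, variational, or ADMM-based method summarized above. The function names (`irmv1_penalty`, `irm_objective`), the toy environments, and the `penalty_weight` parameter are illustrative assumptions.

```python
import numpy as np


def irmv1_penalty(preds, targets):
    """Squared gradient of the squared-error risk with respect to a scalar
    dummy classifier w, evaluated at w = 1 (risk(w) = mean((w*preds - y)^2))."""
    residual = preds - targets
    grad_w = np.mean(2.0 * residual * preds)
    return grad_w ** 2


def irm_objective(envs, phi, penalty_weight=1.0):
    """Sum over environments of (risk + penalty_weight * invariance penalty).

    envs: list of (x, y) pairs with x of shape (n, d) and y of shape (n,).
    phi:  weight vector of shape (d,), acting as a linear featurizer.
    All environments are assumed to be available at once (batch IRM).
    """
    phi = np.asarray(phi, dtype=float)
    total = 0.0
    for x, y in envs:
        preds = x @ phi
        risk = np.mean((preds - y) ** 2)
        total += risk + penalty_weight * irmv1_penalty(preds, y)
    return total


# Toy data (hypothetical): feature 0 equals y in both environments,
# feature 1 is correlated with y but its sign flips between environments.
y = np.array([1.0, 2.0, 3.0])
envs = [
    (np.column_stack([y, 2.0 * y]), y),
    (np.column_stack([y, -1.0 * y]), y),
]

invariant_value = irm_objective(envs, [1.0, 0.0])
spurious_value = irm_objective(envs, [0.0, 0.5])
print(invariant_value, spurious_value)
```

In this toy example the predictor that uses only the invariant feature attains a lower objective than the one relying on the sign-flipping feature. The sequential setting studied in this work differs in that the environments in `envs` would not all be available at the same time.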

