In this work, we aim to establish a strong connection between two significant bodies of machine learning research: continual learning and sequence modeling. That is, we propose to formulate continual learning as a sequence modeling problem, allowing advanced sequence models to be utilized for continual learning. Under this formulation, the continual learning process becomes the forward pass of a sequence model. By adopting the meta-continual learning (MCL) framework, we can train the sequence model at the meta-level, on multiple continual learning episodes. As a specific example of our new formulation, we demonstrate the application of Transformers and their efficient variants as MCL methods. Our experiments on seven benchmarks, covering both classification and regression, show that sequence models can be an attractive solution for general MCL.
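To make the formulation concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes a toy, parameter-free "sequence model": a single softmax-attention read over a growing key-value memory, with negative squared distance standing in for learned query/key projections. The names `attention_predict` and `run_episode` and the `temperature` parameter are hypothetical. The meta-level training over many episodes is omitted entirely. The sketch only shows how a continual learning episode can be consumed as a sequence, so that "learning" an example is a state update inside the forward pass rather than a gradient step.

```python
import math


def attention_predict(memory, query, temperature=1.0):
    """Softmax-attention read over stored (key, value) pairs.

    Scores are negative squared distances (an assumption made for this
    sketch; a Transformer would use learned query/key projections).
    """
    scores = [
        -sum((k - q) ** 2 for k, q in zip(key, query)) / temperature
        for key, _ in memory
    ]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(memory[0][1])
    return [
        sum(w * value[d] for w, (_, value) in zip(weights, memory)) / total
        for d in range(dim)
    ]


def run_episode(train_stream, test_inputs, temperature=0.1):
    """One continual learning episode expressed as a forward pass.

    The training stream is read once, in order. The only thing that
    changes is the internal state (here a list, loosely analogous to a
    Transformer's key-value cache); no parameters are updated.
    """
    memory = []
    for x, y in train_stream:
        memory.append((x, y))  # state update instead of a gradient step
    return [attention_predict(memory, x, temperature) for x in test_inputs]


# Two tasks arrive sequentially: class 0 near the origin, then class 1 near (5, 5).
stream = [
    ([0.0, 0.0], [1.0, 0.0]),
    ([0.2, 0.1], [1.0, 0.0]),
    ([5.0, 5.0], [0.0, 1.0]),
    ([5.1, 4.8], [0.0, 1.0]),
]
predictions = run_episode(stream, [[0.1, 0.0], [5.0, 4.9]])
labels = [p.index(max(p)) for p in predictions]
```

In this toy episode, `labels` holds the predicted class index for each test input. In the MCL setting described above, the fixed scoring rule would be replaced by a parameterized sequence model whose weights are trained at the meta-level across many such episodes.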