We formulate sequential maximum a posteriori inference as a recursion of loss functions and thereby reduce continual learning to the problem of approximating the previous loss function. We then propose two coreset-free methods: autodiff quadratic consolidation, which uses an accurate, full quadratic approximation, and neural consolidation, which uses a neural network approximation. Because these methods do not scale with the neural network size, we study them on classification tasks in combination with a fixed pre-trained feature extractor. We also introduce simple but challenging classical task sequences based on the Iris and Wine datasets. We find that neural consolidation performs well on the classical task sequences, where the input dimension is small, while autodiff quadratic consolidation performs consistently well on image task sequences with a fixed pre-trained feature extractor, achieving performance comparable to joint maximum a posteriori training in many cases.
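To make the recursion concrete, here is a minimal sketch under the usual assumption that task datasets are conditionally independent given the parameters; the notation ($\theta$ for the parameters, $\mathcal{D}_t$ for the data of task $t$, $\mathcal{L}_t$ for the negative log-posterior after task $t$) is chosen here for illustration and is not taken verbatim from the paper:

\[
\mathcal{L}_t(\theta) \;=\; -\log p(\mathcal{D}_t \mid \theta) \;+\; \mathcal{L}_{t-1}(\theta),
\qquad
\mathcal{L}_0(\theta) \;=\; -\log p(\theta).
\]

The MAP estimate after task $t$ minimises $\mathcal{L}_t$, but evaluating $\mathcal{L}_{t-1}$ exactly would require the earlier data, so a coreset-free method must replace it with an approximation built from quantities stored after task $t-1$.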
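Below is a minimal sketch of the quadratic variant in JAX, not the authors' implementation: the previous loss is replaced by its second-order Taylor expansion around the previous MAP estimate, with the full Hessian obtained by automatic differentiation. The names `prev_loss`, `theta_star`, and `new_nll` are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

def quadratic_consolidation(prev_loss, theta_star):
    """Full quadratic approximation of `prev_loss` around `theta_star`.

    prev_loss: callable mapping a flat parameter vector to a scalar loss.
    theta_star: flat parameter vector that (approximately) minimises it.
    """
    hess = jax.hessian(prev_loss)(theta_star)  # full Hessian via autodiff
    const = prev_loss(theta_star)

    def approx_prev_loss(theta):
        d = theta - theta_star
        # Second-order Taylor expansion; the linear term is dropped
        # because theta_star is (approximately) a minimiser.
        return const + 0.5 * d @ hess @ d

    return approx_prev_loss

def consolidated_loss(new_nll, approx_prev_loss):
    """Objective for the current task: new negative log-likelihood
    plus the quadratic stand-in for the previous loss."""
    return lambda theta: new_nll(theta) + approx_prev_loss(theta)

# Toy usage with a quadratic "previous loss" (illustrative only).
theta_star = jnp.zeros(3)
prev_loss = lambda th: 0.5 * jnp.sum(th ** 2)
approx = quadratic_consolidation(prev_loss, theta_star)
print(approx(jnp.ones(3)))  # ~1.5
```

Because the full Hessian is quadratic in the number of parameters, this is only practical for small parameter vectors, which is why the methods are paired with a fixed pre-trained feature extractor.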