Sequential Bayesian inference can be used for continual learning to prevent catastrophic forgetting of past tasks and to provide an informative prior when learning new tasks. We revisit sequential Bayesian inference and test whether having access to the true posterior is guaranteed to prevent catastrophic forgetting in Bayesian neural networks. To do this, we perform sequential Bayesian inference using Hamiltonian Monte Carlo, propagating the posterior as a prior for new tasks by fitting a density estimator on the Hamiltonian Monte Carlo samples. We find that this approach fails to prevent catastrophic forgetting, demonstrating the difficulty of performing sequential Bayesian inference in neural networks. From there, we study simple analytical examples of sequential Bayesian inference and continual learning and highlight the issue of model misspecification, which can lead to sub-optimal continual learning performance despite exact inference. Furthermore, we discuss how task data imbalances can cause forgetting. Given these limitations, we argue that we need probabilistic models of the continual learning generative process rather than relying on sequential Bayesian inference over Bayesian neural network weights. In this vein, we also propose a simple baseline called Prototypical Bayesian Continual Learning, which is competitive with state-of-the-art Bayesian continual learning methods on class-incremental continual learning vision benchmarks.
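The posterior-as-prior recursion behind sequential Bayesian inference can be illustrated with a toy conjugate model. The sketch below is not one of the paper's experiments: it assumes a Gaussian likelihood with known noise variance and an unknown mean, and the helper `gaussian_posterior` and the data values are hypothetical. In this well-specified setting with exact inference, updating on task 1 and then on task 2 gives the same posterior as a single batch update on all the data. That equivalence is what sequential Bayesian continual learning relies on, and it is not available in closed form for neural network weights.

```python
def gaussian_posterior(prior_mean, prior_var, data, noise_var):
    """Exact posterior over an unknown mean, given a Gaussian prior
    and Gaussian observations with known noise variance."""
    post_precision = 1.0 / prior_var + len(data) / noise_var
    post_mean = (prior_mean / prior_var + sum(data) / noise_var) / post_precision
    return post_mean, 1.0 / post_precision


# Hypothetical observations for two tasks (illustrative values only).
task_1 = [0.9, 1.1, 1.0]
task_2 = [2.1, 1.9]
noise_var = 0.5
prior_mean, prior_var = 0.0, 10.0

# Sequential inference: the task-1 posterior becomes the task-2 prior.
m1, v1 = gaussian_posterior(prior_mean, prior_var, task_1, noise_var)
m_seq, v_seq = gaussian_posterior(m1, v1, task_2, noise_var)

# Batch inference on all data at once, for comparison.
m_batch, v_batch = gaussian_posterior(
    prior_mean, prior_var, task_1 + task_2, noise_var
)

print(f"sequential: mean={m_seq:.4f}, var={v_seq:.4f}")
print(f"batch:      mean={m_batch:.4f}, var={v_batch:.4f}")
```

The two printed lines agree up to floating-point error. Note that the resulting mean sits between the two tasks' data: a single shared parameter cannot describe both tasks, a small-scale instance of the model misspecification issue the abstract raises.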