Representation shift is one of the main problems in online continual learning. Current methods mainly address it by reducing the shift itself, but they leave the classifier on top of the representation to adapt slowly, over many update steps, to whatever shift remains, which increases forgetting. We propose DeepCCG, an empirical Bayesian approach to this problem. DeepCCG updates the posterior of a class-conditional Gaussian classifier so that the classifier adapts instantly to representation shift. Using a class-conditional Gaussian classifier also lets DeepCCG update the representation with a log conditional marginal likelihood loss, which can be seen as a new type of replay. To update the classifier and the representation, DeepCCG maintains a fixed number of examples in memory, so a key part of the method is selecting which examples to store: it chooses the subset that minimises the KL divergence between the true posterior and the posterior induced by the subset. We evaluate DeepCCG in a range of settings, including those with overlapping tasks, which have so far been under-explored. In these experiments, DeepCCG outperforms all the other methods evaluated, which is evidence of its potential.
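To make the idea of a classifier that adapts instantly to representation shift concrete, the following is a minimal sketch and not the authors' implementation. It assumes an identity covariance shared across classes, a flat prior over each class mean, and a uniform prior over classes; under those assumptions the posterior over a class mean is Gaussian with mean equal to the sample mean of that class's stored embeddings and covariance I/n. The function names (`fit_class_posteriors`, `log_predictive`, `predict`) and the toy data are hypothetical.

```python
import numpy as np


def fit_class_posteriors(embeddings, labels):
    """Posterior over each class mean from the stored (memory) embeddings.

    Assumption: flat prior on the mean and identity covariance, so the
    posterior for class c is N(sample_mean_c, I / n_c). We store the
    sample mean and the count, which together define that posterior.
    """
    posteriors = {}
    for c in np.unique(labels):
        z_c = embeddings[labels == c]
        posteriors[int(c)] = (z_c.mean(axis=0), len(z_c))
    return posteriors


def log_predictive(z, posteriors):
    """Log posterior-predictive density of embedding z under each class.

    Integrating out the class mean gives N(z; mean_c, (1 + 1/n_c) I).
    """
    d = z.shape[-1]
    scores = {}
    for c, (mean, n) in posteriors.items():
        var = 1.0 + 1.0 / n
        sq_dist = float(np.sum((z - mean) ** 2))
        scores[c] = -0.5 * d * np.log(2.0 * np.pi * var) - 0.5 * sq_dist / var
    return scores


def predict(z, posteriors):
    """Most probable class under a uniform class prior (assumption)."""
    scores = log_predictive(z, posteriors)
    return max(scores, key=scores.get)


# Toy memory: two stored examples per class, embedded by the "old" encoder.
memory_labels = np.array([0, 0, 1, 1])
old_embeddings = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.0, 5.2]])
stale_posteriors = fit_class_posteriors(old_embeddings, memory_labels)

# Simulated representation shift: the encoder now maps every input 10 units away.
new_embeddings = old_embeddings + 10.0
fresh_posteriors = fit_class_posteriors(new_embeddings, memory_labels)

# A class-0 query embedded by the shifted encoder.
query = np.array([10.2, 9.9])
stale_prediction = predict(query, stale_posteriors)
fresh_prediction = predict(query, fresh_posteriors)
```

In this sketch the stale classifier misclassifies the shifted query, while recomputing the posterior from the re-embedded memory restores the correct prediction in a single closed-form step, with no gradient updates to the classifier. This is only meant to illustrate the mechanism the abstract describes; the memory-selection rule and the representation loss are not covered here.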