This paper proposes an assessor-guided learning strategy for continual learning, in which an assessor guides a base learner by controlling the direction and pace of its learning, allowing efficient learning of new environments while protecting against catastrophic interference. The assessor is trained in a meta-learning manner, with a meta-objective of boosting the learning process of the base learner. It applies a soft-weighting mechanism to every sample, accepting positive samples while rejecting negative ones. The base learner is trained to minimize a meta-weighted combination of the cross-entropy loss, the dark experience replay (DER) loss, and the knowledge distillation loss, whose interactions are controlled to improve performance. A compensated over-sampling (COS) strategy is developed to overcome the class imbalance problem that limited memory budgets cause in the episodic memory. Our approach, Assessor-Guided Learning Approach (AGLA), is evaluated on class-incremental and task-incremental learning problems. AGLA achieves improved performance compared with its competitors, and a theoretical analysis of the COS strategy is provided. The source code of AGLA, the baseline algorithms, and the experimental logs are publicly available at \url{https://github.com/anwarmaxsum/AGLA} for further study.
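The meta-weighted training objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-sample weight triples (a stand-in for the assessor's output), the temperature value, and the simplification that every sample carries all three loss terms are assumptions made for illustration. The DER term is written as the mean squared error between current and stored logits, and the distillation term as a KL divergence on temperature-softened outputs, which are common formulations of these losses.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Cross-entropy of one sample against its integer class label."""
    return -math.log(softmax(logits)[label])

def der_loss(logits, stored_logits):
    """DER-style term: mean squared error between current and stored logits."""
    return sum((a - b) ** 2 for a, b in zip(logits, stored_logits)) / len(logits)

def kd_loss(logits, teacher_logits, temperature=2.0):
    """Distillation term: KL(teacher || student) on softened outputs.
    The temperature of 2.0 is an assumed value, not taken from the paper."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(p_t, p_s) if t > 0)

def weighted_objective(samples, weights):
    """Mean over samples of w_ce * CE + w_der * DER + w_kd * KD.
    `weights` holds one hypothetical (w_ce, w_der, w_kd) triple per sample,
    standing in for the soft weights an assessor would produce."""
    total = 0.0
    for s, (w_ce, w_der, w_kd) in zip(samples, weights):
        total += (w_ce * cross_entropy(s["logits"], s["label"])
                  + w_der * der_loss(s["logits"], s["stored_logits"])
                  + w_kd * kd_loss(s["logits"], s["teacher_logits"]))
    return total / len(samples)

# Illustrative values only: two samples with made-up logits and weights.
samples = [
    {"logits": [2.0, 0.5, -1.0], "label": 0,
     "stored_logits": [1.8, 0.4, -0.9], "teacher_logits": [2.2, 0.3, -1.1]},
    {"logits": [0.1, 0.2, 0.3], "label": 1,
     "stored_logits": [0.0, 1.5, -0.5], "teacher_logits": [0.0, 1.2, -0.3]},
]
weights = [(0.9, 0.5, 0.5), (0.1, 0.8, 0.7)]  # near-zero weight = soft rejection
print(weighted_objective(samples, weights))
```

A weight close to zero effectively rejects a sample's contribution to that loss term, while a weight close to one accepts it, which is one way to read the soft-weighting mechanism described above. How the weights are actually produced and meta-trained is specific to AGLA and is not reproduced here.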