Continual learning strives to ensure stability on previously seen tasks while demonstrating plasticity in novel domains. Recent advances in continual learning are mostly confined to the supervised learning setting, especially in the NLP domain. In this work, we consider a few-shot continual active learning setting in which labeled data are scarce and unlabeled data are abundant but subject to a limited annotation budget. We exploit meta-learning and propose a method called Meta-Continual Active Learning, which sequentially queries the most informative examples from a pool of unlabeled data for annotation, enhancing task-specific performance while tackling continual learning through a meta-objective. Specifically, we employ meta-learning and experience replay to address inter-task confusion and catastrophic forgetting. We further incorporate textual augmentations to avoid the memory over-fitting caused by experience replay and sample queries, thereby ensuring generalization. We conduct extensive experiments on benchmark text classification datasets from diverse domains to validate the feasibility and effectiveness of meta-continual active learning. We also analyze the impact of different active learning strategies on various meta-continual learning models. The experimental results demonstrate that introducing randomness into sample selection is the best default strategy for maintaining generalization in a meta-continual learning framework.
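The query-and-replay loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the `train_step` callback (standing in for the meta-objective update) are hypothetical, and random selection is used as the acquisition strategy, reflecting the finding that randomness in sample selection is a strong default.

```python
import random

def random_query(unlabeled_pool, budget):
    # Acquisition step: pick up to `budget` samples to send for annotation.
    # Random selection stands in for an informativeness-based strategy here.
    return random.sample(unlabeled_pool, min(budget, len(unlabeled_pool)))

def meta_continual_active_learning(tasks, budget, buffer_size=32, train_step=None):
    """Sequentially process tasks: query samples within the annotation budget,
    mix them with replayed examples from earlier tasks, and run one
    (hypothetical) meta-objective update per task."""
    replay_buffer = []
    for unlabeled_pool in tasks:
        queried = random_query(unlabeled_pool, budget)  # annotated after querying
        # Experience replay: combine new annotations with stored past examples
        # to mitigate inter-task confusion and catastrophic forgetting.
        replayed = random.sample(replay_buffer, min(len(replay_buffer), budget))
        batch = queried + replayed
        if train_step is not None:
            train_step(batch)  # placeholder for the meta-learning update
        # Keep a bounded memory of past annotated examples.
        replay_buffer.extend(queried)
        replay_buffer = replay_buffer[-buffer_size:]
    return replay_buffer
```

With two toy tasks of ten unlabeled items each and a budget of three, the loop stores three annotated examples per task, so the returned buffer holds six items.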