Continual learning has recently attracted attention from the research community, as it aims to address long-standing limitations of classic models trained with supervision. However, most research on this subject has studied continual learning only in simple image classification scenarios. In this paper, we present a benchmark of state-of-the-art continual learning methods on video action recognition. Besides the added complexity of the temporal dimension, the video setting places heavier demands on computing resources for top-performing rehearsal methods. To counteract the increased memory requirements, we present two method-agnostic variants of rehearsal methods that use measures of either model confidence or data information to select memorable samples. Our experiments show that, as expected from the literature, rehearsal methods outperform other approaches; moreover, the proposed memory-efficient variants prove effective at retaining a certain level of performance with a smaller buffer size.
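The abstract does not spell out how model confidence is turned into a selection rule, so the following is only a minimal sketch under stated assumptions: confidence is taken to be the softmax probability the model assigns to the true class, and the buffer keeps the samples with the highest such confidence. The function names `softmax` and `select_by_confidence`, and the choice of favouring high rather than low confidence, are hypothetical and not taken from the paper.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one vector of class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_by_confidence(
    logits_per_sample: Sequence[Sequence[float]],
    labels: Sequence[int],
    buffer_size: int,
) -> List[int]:
    """Return indices of the samples to store in the rehearsal buffer.

    Assumption (not from the paper): a sample's score is the softmax
    probability of its true class, and higher scores are preferred.
    Ties are broken by the lower sample index.
    """
    scored = []
    for idx, (logits, label) in enumerate(zip(logits_per_sample, labels)):
        confidence = softmax(logits)[label]
        scored.append((confidence, idx))
    # Sort by descending confidence, then ascending index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:buffer_size]]


# Toy example: four samples, two classes, room for two samples in the buffer.
example_logits = [[2.0, 0.0], [0.1, 0.0], [0.0, 3.0], [1.0, 1.0]]
example_labels = [0, 0, 1, 1]
kept = select_by_confidence(example_logits, example_labels, buffer_size=2)
print(kept)
```

Because the selection only needs per-sample logits and labels, a rule of this kind can sit in front of any rehearsal method's buffer update, which is one way to read "method-agnostic" here. A variant based on data information would replace the scoring line with a model-independent measure.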