In this paper, we propose a method for incremental learning of two distinct tasks over time: acoustic scene classification (ASC) and audio tagging (AT). We use a simple convolutional neural network (CNN) model as the incremental learner for both tasks. Incremental learners typically suffer catastrophic forgetting of the previous task when trained sequentially on a new one. To alleviate this problem, we propose independent learning and knowledge distillation (KD) between successive learning timesteps. Experiments are performed on the TUT 2016/2017 dataset, which contains 4 acoustic scene classes and 25 sound event classes. The proposed incremental learner first solves the ASC task with an accuracy of 94.0%. Next, it learns to solve the AT task with an F1 score of 54.4%. At the same time, its performance on the previous ASC task decreases by only 5.1 percentage points as a result of additionally learning the AT task.
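To make the role of knowledge distillation between timesteps concrete, the following is a minimal sketch of one common way to combine a new-task loss with a distillation term. It is not the implementation used in this work: the abstract does not specify the loss, so the temperature-softened KL divergence on the ASC outputs, the binary cross-entropy for multi-label AT, and all names and defaults (`distillation_loss`, `tagging_loss`, `incremental_loss`, `kd_weight`, `temperature`) are assumptions made for illustration. Only the class counts (4 scenes, 25 sound events) come from the text.

```python
import numpy as np

EPS = 1e-12  # numerical floor inside logarithms


def softmax(logits, temperature=1.0):
    """Row-wise softmax of temperature-scaled logits."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Assumed KD term: KL(teacher || student) on softened ASC outputs.

    The teacher is the frozen model from the previous timestep (ASC only);
    the student is the model currently being trained on AT.
    """
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t + EPS) - np.log(p_s + EPS)), axis=-1)
    return float(kl.mean())


def tagging_loss(tag_logits, tag_targets):
    """Binary cross-entropy for multi-label audio tagging."""
    p = 1.0 / (1.0 + np.exp(-tag_logits))
    bce = -(tag_targets * np.log(p + EPS)
            + (1.0 - tag_targets) * np.log(1.0 - p + EPS))
    return float(bce.mean())


def incremental_loss(student_asc_logits, teacher_asc_logits,
                     tag_logits, tag_targets,
                     kd_weight=1.0, temperature=2.0):
    """New-task (AT) loss plus a weighted KD penalty on the old task (ASC)."""
    return (tagging_loss(tag_logits, tag_targets)
            + kd_weight * distillation_loss(student_asc_logits,
                                            teacher_asc_logits,
                                            temperature))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch = 8
    teacher_asc = rng.normal(size=(batch, 4))    # 4 acoustic scene classes
    student_asc = teacher_asc + 0.1 * rng.normal(size=(batch, 4))
    tag_logits = rng.normal(size=(batch, 25))    # 25 sound event classes
    tag_targets = (rng.random(size=(batch, 25)) > 0.8).astype(float)
    print(incremental_loss(student_asc, teacher_asc, tag_logits, tag_targets))
```

Under these assumptions, the KD term is zero when the student reproduces the previous model's ASC outputs exactly and grows as the two diverge, which is what discourages forgetting while the AT loss is minimized. The hypothetical `kd_weight` trades off retaining ASC performance against fitting the new AT task.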