The intrinsic difficulty of adapting deep learning models to non-stationary environments limits the applicability of neural networks to real-world tasks. This issue is critical in practical supervised learning settings, such as those in which a pre-trained model computes projections toward a latent space where different task predictors are learned sequentially over time. Incrementally fine-tuning the whole model to better adapt to new tasks usually results in catastrophic forgetting: performance on past experiences degrades and valuable knowledge from the pre-training stage is lost. In this paper, we propose a novel strategy to make the fine-tuning procedure more effective. The pre-trained part of the network is not updated; instead, we learn not only the usual classification head but also a set of newly introduced learnable parameters that are responsible for transforming the input data. This process allows the network to effectively leverage the pre-training knowledge and find a good trade-off between plasticity and stability with modest computational effort, which makes it especially suitable for on-the-edge settings. Our experiments on four image classification problems in a continual learning setting confirm the quality of the proposed approach when compared to several fine-tuning procedures and to popular continual learning methods.
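To make the idea concrete, the following is a minimal sketch of the general recipe described above: a frozen feature extractor, a learnable transformation applied to the input, and a learnable classification head. It is not the implementation proposed in the paper. The per-feature affine form of the input transformation, the tiny `tanh` backbone with weights `FROZEN_W`, the toy dataset `TOY_DATA`, and all function names are assumptions chosen purely for illustration.

```python
import math

# Hypothetical stand-in for a pre-trained backbone: h = tanh(W x).
# Stored as nested tuples so it cannot be modified during training.
FROZEN_W = ((1.0, -0.5), (0.3, 0.8), (-0.7, 0.2))

# Hypothetical toy binary task (inputs, label); not taken from the paper.
TOY_DATA = [((1.0, 0.5), 1), ((0.8, -0.3), 1), ((-1.0, 0.2), 0), ((-0.6, -0.7), 0)]


def forward(x, scale, shift, head_w, head_b):
    """Return (features, probability of class 1) for one input."""
    # Learnable input transformation (assumed per-feature affine map).
    x_t = [s * xi + t for s, xi, t in zip(scale, x, shift)]
    # Frozen backbone: its weights are read but never updated.
    h = [math.tanh(sum(w * v for w, v in zip(row, x_t))) for row in FROZEN_W]
    # Learnable classification head (logistic regression on the features).
    z = sum(w * hi for w, hi in zip(head_w, h)) + head_b
    return h, 1.0 / (1.0 + math.exp(-z))


def mean_loss(data, scale, shift, head_w, head_b):
    """Mean binary cross-entropy over the dataset."""
    total = 0.0
    for x, y in data:
        p = forward(x, scale, shift, head_w, head_b)[1]
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)


def train(data, steps=300, lr=0.2):
    """Full-batch gradient descent on the input transform and head only."""
    d, k = len(data[0][0]), len(FROZEN_W)
    scale, shift = [1.0] * d, [0.0] * d  # transform starts as the identity
    head_w, head_b = [0.0] * k, 0.0
    n = len(data)
    for _ in range(steps):
        g_scale, g_shift = [0.0] * d, [0.0] * d
        g_w, g_b = [0.0] * k, 0.0
        for x, y in data:
            h, p = forward(x, scale, shift, head_w, head_b)
            dz = (p - y) / n  # gradient of the mean loss w.r.t. the logit
            g_b += dz
            # Backpropagate through the head and the tanh nonlinearity.
            du = []
            for i, hi in enumerate(h):
                g_w[i] += dz * head_w[i] * 0.0 + dz * hi
                du.append(dz * head_w[i] * (1.0 - hi * hi))
            # The frozen weights only route gradient back to the input.
            for j in range(d):
                dx = sum(du[i] * FROZEN_W[i][j] for i in range(k))
                g_scale[j] += dx * x[j]
                g_shift[j] += dx
        scale = [s - lr * g for s, g in zip(scale, g_scale)]
        shift = [t - lr * g for t, g in zip(shift, g_shift)]
        head_w = [w - lr * g for w, g in zip(head_w, g_w)]
        head_b -= lr * g_b
    return scale, shift, head_w, head_b


if __name__ == "__main__":
    params = train(TOY_DATA)
    print("loss after training:", mean_loss(TOY_DATA, *params))
```

In this sketch only `scale`, `shift`, `head_w`, and `head_b` receive updates, so whatever the backbone encodes is preserved by construction. In a continual learning setting one would typically keep the same frozen backbone across tasks and train such lightweight parameters as new tasks arrive; how the actual method parameterizes and schedules this is not specified in the abstract.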