The intrinsic difficulty of adapting deep learning models to non-stationary environments limits the applicability of neural networks to real-world tasks. This issue is critical in practical supervised learning settings, such as those in which a pre-trained model computes projections onto a latent space where different task predictors are learned sequentially over time. Incrementally fine-tuning the whole model to better adapt to new tasks usually results in catastrophic forgetting: performance on past experiences degrades and valuable knowledge from the pre-training stage is lost. In this paper, we propose a novel strategy that makes fine-tuning more effective by not updating the pre-trained part of the network and learning not only the usual classification head but also a set of newly introduced learnable parameters that transform the input data. This process allows the network to effectively leverage the pre-training knowledge and to find a good trade-off between plasticity and stability with modest computational effort, which makes it especially suitable for on-the-edge settings. Our experiments on four image classification problems in a continual learning setting confirm the quality of the proposed approach when compared to several fine-tuning procedures and to popular continual learning methods.
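The training setup described in the abstract (pre-trained weights left untouched, with only a classification head and a small set of input-transforming parameters being learned) can be illustrated with a minimal PyTorch sketch. The abstract does not specify what form the input transformation takes, so the per-channel scale and shift used here is purely an assumption, as are the names `InputTransform` and `FrozenBackboneClassifier`, the tiny stand-in backbone, and the random data. It is not the authors' implementation.

```python
import torch
import torch.nn as nn


class InputTransform(nn.Module):
    """Hypothetical learnable input transformation (per-channel scale and shift).

    Assumption: the abstract does not say how the input is transformed;
    an affine map is used here only for illustration.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.shift = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.scale + self.shift


class FrozenBackboneClassifier(nn.Module):
    """Learnable input transform -> frozen backbone -> learnable head."""

    def __init__(self, backbone: nn.Module, feat_dim: int,
                 num_classes: int, in_channels: int = 3):
        super().__init__()
        self.transform = InputTransform(in_channels)
        self.backbone = backbone
        self.head = nn.Linear(feat_dim, num_classes)
        # Freeze the backbone: its weights receive no gradient updates.
        for p in self.backbone.parameters():
            p.requires_grad_(False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(self.transform(x)))

    def trainable_parameters(self):
        return [p for p in self.parameters() if p.requires_grad]


torch.manual_seed(0)

# Stand-in for a pre-trained feature extractor (hypothetical, randomly initialised).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
model = FrozenBackboneClassifier(backbone, feat_dim=8, num_classes=4)
model.backbone.eval()  # keep the frozen part in inference mode

# Snapshots used to check which parameters move during training.
backbone_before = [p.detach().clone() for p in model.backbone.parameters()]
scale_before = model.transform.scale.detach().clone()
head_before = model.head.weight.detach().clone()

# Only the input transform and the head are handed to the optimizer.
optimizer = torch.optim.SGD(model.trainable_parameters(), lr=0.1)

# Random data standing in for one task's training batch.
x = torch.randn(16, 3, 8, 8)
y = torch.randint(0, 4, (16,))

for _ in range(5):
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()  # gradients still flow through the frozen backbone to the transform
    optimizer.step()
```

In this sketch the backbone still takes part in the backward pass, since gradients must reach the input transform, but its weights never change. Only the handful of transform parameters and the head are updated, which is one way to read the "modest computational effort" claimed in the abstract.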