In many real-world applications, deep neural networks are retrained from scratch as a dataset grows in size. Given the computational expense of retraining networks, it has been argued that continual learning could make updating networks more efficient. An obstacle to achieving this goal is the stability gap: the observation that, when a network is updated on new data, performance on previously learned data degrades before recovering. Addressing this problem would enable learning new data with fewer network updates, resulting in increased computational efficiency. We study how to mitigate the stability gap. We test a variety of hypotheses to understand why it occurs, which leads us to a method that vastly reduces the gap. In large-scale class-incremental learning experiments, we significantly reduce the number of network updates needed for continual learning. Our work has the potential to advance the state of the art in continual learning for real-world applications and to reduce the carbon footprint of keeping neural networks up to date.
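To make the definition of the stability gap concrete, the following minimal sketch summarizes the dip from a trace of accuracy on previously learned data, recorded at successive evaluation points during an update on new data. It is an illustration only, not the measurement protocol or the mitigation method of this work: the function name `stability_gap`, the returned keys, and the example trace are all hypothetical, and the sketch assumes the first entry of the trace is the accuracy before the update begins.

```python
from typing import Dict, Optional, Sequence, Union


def stability_gap(old_data_acc: Sequence[float]) -> Dict[str, Union[float, int, Optional[int]]]:
    """Summarize the dip in old-data accuracy during an update (hypothetical helper).

    Assumes old_data_acc[0] is the accuracy on previously learned data
    before the update starts, and that later entries are measured at
    successive evaluation points while training on new data.
    """
    if not old_data_acc:
        raise ValueError("old_data_acc must contain at least one measurement")

    baseline = old_data_acc[0]

    # Index of the lowest accuracy reached during the update (the trough).
    trough_idx = min(range(len(old_data_acc)), key=lambda i: old_data_acc[i])

    # Depth of the dip relative to the pre-update accuracy.
    drop = baseline - old_data_acc[trough_idx]

    # First evaluation point at or after the trough where accuracy is back
    # at or above the pre-update level; None if it never recovers.
    recovered_at = next(
        (i for i in range(trough_idx, len(old_data_acc)) if old_data_acc[i] >= baseline),
        None,
    )

    return {"drop": drop, "trough_idx": trough_idx, "recovered_at": recovered_at}


# Made-up trace for illustration: accuracy dips, then recovers.
example_trace = [0.80, 0.55, 0.40, 0.62, 0.78, 0.81]
summary = stability_gap(example_trace)
```

Under these assumptions, a smaller `drop` and an earlier `recovered_at` would both indicate a reduced stability gap, which is the sense in which fewer network updates are needed to restore performance on old data.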