Prior work has demonstrated a consistent tendency in neural networks engaged in continual learning tasks, whereby intermediate task similarity results in the highest levels of catastrophic interference. This phenomenon is attributed to the network's tendency to reuse learned features across tasks. However, this explanation relies heavily on the premise that neuron specialisation occurs, i.e. the emergence of localised representations. Our investigation challenges the validity of this assumption. Using theoretical frameworks for the analysis of neural networks, we show a strong dependence of specialisation on the initial conditions. More precisely, we show that weight imbalance and high weight entropy can favour specialised solutions. We then apply these insights to continual learning, first showing the emergence of a monotonic relation between task similarity and forgetting in non-specialised networks. Finally, we show that specialisation induced by weight imbalance is beneficial for the commonly employed elastic weight consolidation regularisation technique.
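For readers unfamiliar with the regularisation technique named above, the following is a minimal sketch of the standard elastic weight consolidation penalty (Kirkpatrick et al., 2017), not the paper's own implementation; the names `model`, `old_params`, `fisher`, and `lam` are illustrative placeholders.

```python
import torch

def ewc_penalty(model, old_params, fisher, lam=1.0):
    """Standard EWC penalty: a quadratic term anchoring each parameter to its
    value after the previous task, weighted by the diagonal Fisher information.

    model      -- torch.nn.Module being trained on the new task (placeholder name)
    old_params -- dict mapping parameter names to tensors saved after the old task
    fisher     -- dict mapping parameter names to diagonal Fisher estimates
    lam        -- regularisation strength (hypothetical default)
    """
    loss = torch.tensor(0.0)
    for name, p in model.named_parameters():
        # Penalise movement away from the old-task solution, scaled by how
        # important each parameter was (its estimated Fisher information).
        loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * loss
```

In use, this penalty is simply added to the new task's loss before backpropagation; the abstract's final claim concerns how specialisation induced by weight imbalance interacts with this kind of quadratic anchoring.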