Plasticity, the ability of a neural network to quickly change its predictions in response to new information, is essential for the adaptability and robustness of deep reinforcement learning systems. Deep neural networks are known to lose plasticity over the course of training even in relatively simple learning problems, but the mechanisms driving this phenomenon are still poorly understood. This paper conducts a systematic empirical analysis of plasticity loss, with the goal of understanding the phenomenon mechanistically in order to guide the future development of targeted solutions. We find that loss of plasticity is deeply connected to changes in the curvature of the loss landscape, but that it often occurs in the absence of saturated units. Based on this insight, we identify a number of parameterization and optimization design choices which enable networks to better preserve plasticity over the course of training. We validate the utility of these findings on larger-scale RL benchmarks in the Arcade Learning Environment.