Plasticity, the ability of a neural network to quickly change its predictions in response to new information, is essential for the adaptability and robustness of deep reinforcement learning systems. Deep neural networks are known to lose plasticity over the course of training even in relatively simple learning problems, but the mechanisms driving this phenomenon are still poorly understood. This paper conducts a systematic empirical analysis of plasticity loss, with the goal of understanding the phenomenon mechanistically in order to guide the future development of targeted solutions. We find that loss of plasticity is deeply connected to changes in the curvature of the loss landscape, but that it typically occurs in the absence of saturated units or divergent gradient norms. Based on this insight, we identify a number of parameterization and optimization design choices which enable networks to better preserve plasticity over the course of training. We validate the utility of these findings in larger-scale learning problems by applying the best-performing intervention, layer normalization, to a deep RL agent trained on the Arcade Learning Environment.
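For readers unfamiliar with the intervention named above, the following is a minimal, dependency-free sketch of layer normalization applied to the pre-activations of a single hidden layer. It is illustrative only and is not taken from the paper: the function names (`layer_norm`, `dense`, `hidden_layer`), the `eps` value, and the choice to normalize before the ReLU nonlinearity are all assumptions made for this example.

```python
import math


def layer_norm(x, gain=None, bias=None, eps=1e-5):
    """Normalize one feature vector to roughly zero mean and unit variance,
    then apply an optional per-feature scale (gain) and shift (bias)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)  # eps guards against division by zero
    normed = [(v - mean) * inv_std for v in x]
    if gain is None:
        gain = [1.0] * n
    if bias is None:
        bias = [0.0] * n
    return [g * v + b for g, v, b in zip(gain, normed, bias)]


def dense(x, weights, biases):
    """Affine map: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def hidden_layer(x, weights, biases):
    """Hypothetical hidden layer: affine map, then layer norm, then ReLU.
    Normalizing before the nonlinearity is an assumption of this sketch."""
    return relu(layer_norm(dense(x, weights, biases)))
```

Because the statistics are computed over the features of a single input, the normalized pre-activations are approximately invariant to a rescaling of that input: `layer_norm([1.0, 2.0, 3.0, 4.0])` and `layer_norm([10.0, 20.0, 30.0, 40.0])` agree up to the small effect of `eps`.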