In this work we identify the dormant neuron phenomenon in deep reinforcement learning, where an agent's network suffers from an increasing number of inactive neurons, thereby affecting network expressivity. We demonstrate the presence of this phenomenon across a variety of algorithms and environments, and highlight its effect on learning. To address this issue, we propose a simple and effective method (ReDo) that Recycles Dormant neurons throughout training. Our experiments demonstrate that ReDo maintains the expressive power of networks by reducing the number of dormant neurons and results in improved performance.
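The abstract does not state how dormancy is measured or how neurons are recycled, so the following is only a minimal sketch under stated assumptions. It assumes a neuron counts as dormant when its mean absolute activation over a batch, divided by the layer-wide average, is at or below a threshold `tau`, and that recycling means drawing fresh incoming weights and zeroing the outgoing ones. The names `dormant_mask`, `recycle`, `tau`, and `scale` are hypothetical and chosen for illustration.

```python
import random


def dormant_mask(activations, tau=0.0):
    """Flag dormant neurons in one layer.

    activations: list of rows, one per input, each row holding the
    layer's per-neuron outputs. Returns one bool per neuron.
    Assumption: dormancy = normalized mean |activation| <= tau.
    """
    n_inputs = len(activations)
    n_neurons = len(activations[0])
    # Mean absolute activation of each neuron over the batch.
    mean_abs = [
        sum(abs(row[j]) for row in activations) / n_inputs
        for j in range(n_neurons)
    ]
    layer_mean = sum(mean_abs) / n_neurons
    if layer_mean == 0.0:
        # Every neuron is silent, so all of them are flagged.
        return [True] * n_neurons
    return [(m / layer_mean) <= tau for m in mean_abs]


def recycle(w_in, w_out, mask, rng, scale=0.1):
    """Recycle flagged neurons in place (assumed rule, not the paper's spec).

    w_in[j]:  incoming weights of neuron j, re-drawn uniformly in [-scale, scale].
    w_out[j]: outgoing weights of neuron j, set to zero so the recycled
              neuron does not perturb the network's current output.
    """
    for j, is_dormant in enumerate(mask):
        if is_dormant:
            w_in[j] = [rng.uniform(-scale, scale) for _ in w_in[j]]
            w_out[j] = [0.0 for _ in w_out[j]]
    return w_in, w_out


# Usage: three neurons observed on two inputs; the first never fires.
example_acts = [[0.0, 1.0, 2.0], [0.0, 3.0, 0.5]]
example_mask = dormant_mask(example_acts, tau=0.1)  # [True, False, False]
```

Zeroing the outgoing weights is a design choice in this sketch: it leaves the layer's downstream output unchanged at the moment of recycling, while the fresh incoming weights give the neuron a chance to become active again.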