As third-generation neural networks, Spiking Neural Networks (SNNs) have great potential on neuromorphic hardware because of their high energy efficiency. However, Deep Spiking Reinforcement Learning (DSRL), i.e., Reinforcement Learning (RL) based on SNNs, is still in its preliminary stage because the spiking function has a binary output and is non-differentiable. To address these issues, we propose a Deep Spiking Q-Network (DSQN). Specifically, we propose a directly-trained deep spiking reinforcement learning architecture based on Leaky Integrate-and-Fire (LIF) neurons and the Deep Q-Network (DQN). We then adapt a direct spiking learning algorithm to the DSQN, and we demonstrate theoretically the advantages of using LIF neurons in it. We conduct comprehensive experiments on 17 top-performing Atari games to compare our method with the state-of-the-art conversion method. The experimental results demonstrate the superiority of our method in terms of performance, stability, robustness, and energy efficiency. To the best of our knowledge, our work is the first to achieve state-of-the-art performance on multiple Atari games with a directly-trained SNN.
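To make the two obstacles named above concrete (binary output and a non-differentiable firing function), the following is a minimal sketch of a single discrete-time LIF neuron. It is not the DSQN implementation: the function names `lif_simulate` and `surrogate_grad`, the parameters `tau`, `v_threshold`, `v_reset`, and `alpha`, the hard reset, and the sigmoid-based surrogate derivative are all illustrative assumptions, since the abstract does not specify the neuron parameters or the learning rule.

```python
import math


def lif_simulate(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one discrete-time LIF neuron (illustrative parameters).

    Returns the binary spike train and the membrane potential recorded
    at each step before any reset is applied.
    """
    v = v_reset
    spikes, trace = [], []
    for x in inputs:
        # Leaky integration: v decays toward v_reset while accumulating input.
        v = v + (x - (v - v_reset)) / tau
        trace.append(v)
        # Heaviside firing: the output is strictly 0 or 1.
        s = 1 if v >= v_threshold else 0
        if s:
            v = v_reset  # hard reset after a spike (an assumption here)
        spikes.append(s)
    return spikes, trace


def surrogate_grad(v, v_threshold=1.0, alpha=2.0):
    """Smooth stand-in for the Heaviside derivative (assumed, not from the paper).

    This is the derivative of sigmoid(alpha * (v - v_threshold)); the true
    derivative of the firing function is zero almost everywhere.
    """
    sig = 1.0 / (1.0 + math.exp(-alpha * (v - v_threshold)))
    return alpha * sig * (1.0 - sig)


if __name__ == "__main__":
    spikes, trace = lif_simulate([1.5, 1.5, 1.5, 1.5])
    print(spikes)  # [0, 1, 0, 1]
    print(trace)   # [0.75, 1.125, 0.75, 1.125]
```

With a constant input of 1.5, the membrane potential needs two steps to cross the threshold, so the neuron fires on every second step. The spike train carries only 0s and 1s, which is why a Q-value readout and a gradient substitute such as `surrogate_grad` are design questions for any directly-trained spiking Q-network.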