As a general method for exploration in deep reinforcement learning (RL), NoisyNet can produce problem-specific exploration strategies. Spiking neural networks (SNNs), owing to their binary firing mechanism, are highly robust to noise, which makes it difficult to achieve efficient exploration through local perturbations. To solve this exploration problem, we propose a noisy spiking actor network (NoisySAN) that introduces time-correlated noise during charging and transmission. Moreover, we propose a noise reduction method that helps the agent find a stable policy. Extensive experimental results demonstrate that our method outperforms state-of-the-art methods on a wide range of continuous control tasks from OpenAI Gym.
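To make the idea of time-correlated noise during charging concrete, here is a minimal sketch. It is not the NoisySAN implementation. It assumes a single leaky integrate-and-fire neuron with a hard reset, and it assumes the noise follows a first-order autoregressive (AR(1)) process; the abstract specifies neither. The names `correlated_noise`, `noisy_lif_neuron`, `sigma`, `beta`, `decay`, and `threshold` are hypothetical, and noise on the transmission side is left out.

```python
import math
import random


def correlated_noise(steps, beta=0.9, seed=0):
    """Assumed AR(1) noise: eps[t] = beta * eps[t-1] + sqrt(1 - beta^2) * xi[t]."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - beta * beta)  # keeps the variance close to 1
    eps = [rng.gauss(0.0, 1.0)]
    for _ in range(steps - 1):
        eps.append(beta * eps[-1] + scale * rng.gauss(0.0, 1.0))
    return eps


def noisy_lif_neuron(currents, sigma=0.1, beta=0.9, decay=0.5,
                     threshold=1.0, seed=0):
    """Simulate one LIF neuron whose input current carries correlated noise.

    currents: list of input currents, one per time step.
    Returns a list of binary spikes (0 or 1), one per time step.
    """
    eps = correlated_noise(len(currents), beta, seed)
    v = 0.0  # membrane potential
    spikes = []
    for i_t, e_t in zip(currents, eps):
        # Charging: leak the old potential, then add the noisy current.
        v = decay * v + i_t + sigma * e_t
        fired = v >= threshold
        spikes.append(1 if fired else 0)
        if fired:
            v = 0.0  # hard reset after a spike (an assumption of this sketch)
    return spikes


if __name__ == "__main__":
    print(noisy_lif_neuron([0.6] * 10, sigma=0.0))  # noise-free baseline
    print(noisy_lif_neuron([0.6] * 10, sigma=0.3))  # with correlated noise
```

Because consecutive noise samples are correlated, a perturbation persists across several time steps rather than averaging out within one, which is one plausible reason such noise could shift the firing pattern of an otherwise noise-robust spiking neuron.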