As a general method for exploration in deep reinforcement learning (RL), NoisyNet can produce problem-specific exploration strategies. Spiking neural networks (SNNs), however, are highly robust to noise because of their binary firing mechanism, which makes it difficult to achieve efficient exploration through local perturbations. To address this exploration problem, we propose the noisy spiking actor network (NoisySAN), which introduces time-correlated noise during charging and transmission. We also propose a noise reduction method that helps the agent find a stable policy. Extensive experiments show that our method outperforms state-of-the-art methods on a wide range of continuous control tasks from OpenAI Gym.
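The idea of injecting time-correlated noise during charging and transmission can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the noise is modeled as a first-order autoregressive (AR(1)) Gaussian process, the neuron is a simple leaky integrate-and-fire unit with hard reset, and the names `ar1_noise`, `noisy_lif_layer`, `rho`, `sigma`, `decay`, and `v_th` are hypothetical. The noise reduction method is not shown.

```python
import numpy as np


def ar1_noise(steps, size, rho=0.9, sigma=0.1, rng=None):
    """Time-correlated Gaussian noise (assumed AR(1) form, not the paper's).

    eps[t] = rho * eps[t-1] + sqrt(1 - rho^2) * sigma * xi[t], xi ~ N(0, 1),
    so each step has standard deviation sigma and lag-1 correlation rho.
    """
    rng = np.random.default_rng() if rng is None else rng
    eps = np.zeros((steps, size))
    eps[0] = sigma * rng.standard_normal(size)
    for t in range(1, steps):
        fresh = rng.standard_normal(size)
        eps[t] = rho * eps[t - 1] + np.sqrt(1.0 - rho**2) * sigma * fresh
    return eps


def noisy_lif_layer(inputs, weights, charge_noise, trans_noise,
                    decay=0.5, v_th=0.5):
    """Run one layer of leaky integrate-and-fire neurons for T time steps.

    inputs:       (T, n_in) input spikes
    weights:      (n_in, n_out) synaptic weights
    charge_noise: (T, n_out) noise added to the membrane potential (charging)
    trans_noise:  (T, n_out) noise added to the emitted signal (transmission)
    Returns the binary spikes and the noisy signal passed downstream.
    """
    steps, n_out = inputs.shape[0], weights.shape[1]
    v = np.zeros(n_out)
    spikes = np.zeros((steps, n_out))
    outputs = np.zeros((steps, n_out))
    for t in range(steps):
        current = inputs[t] @ weights
        v = decay * v + current + charge_noise[t]  # noisy charging
        fired = v >= v_th
        spikes[t] = fired
        v = np.where(fired, 0.0, v)                # hard reset after a spike
        outputs[t] = spikes[t] + trans_noise[t]    # noisy transmission
    return spikes, outputs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, n_in, n_out = 5, 4, 3
    x = (rng.random((T, n_in)) < 0.5).astype(float)
    w = rng.normal(0.0, 0.5, size=(n_in, n_out))
    s, y = noisy_lif_layer(x, w,
                           ar1_noise(T, n_out, rng=rng),
                           ar1_noise(T, n_out, rng=rng))
    print(s.shape, y.shape)
```

In this sketch the charging noise can change which neurons cross the threshold, while the transmission noise perturbs the signal even when the binary spike pattern is unchanged, which is one plausible reading of why both injection points matter for exploration.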