In recent years, spiking neural networks (SNNs) have been applied to reinforcement learning (RL) because of their low power consumption and event-driven operation. However, spiking reinforcement learning (SRL) is constrained by fixed coding methods and still suffers from high latency and poor versatility. In this paper, we encode and decode spikes with learnable matrix multiplication, which makes the coders more flexible and thereby reduces latency. We also train the SNNs with the direct training method and use two different structures for online and offline RL algorithms, which broadens the range of applications of our model. Extensive experiments across different algorithms and environments show that our method achieves optimal performance with ultra-low latency (as low as 0.8% of that of other SRL methods) and excellent energy efficiency (up to 5× that of DNNs).
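To make the idea of learnable matrix-multiplication coders concrete, the following is a minimal NumPy sketch of one possible formulation. The abstract does not specify the exact construction, so everything here is an assumption made for illustration: the per-timestep encoder matrices `w_enc`, the flattened decoder matrix `w_dec`, the leaky integrate-and-fire neuron with hard reset, and all sizes and constants. Training (for example with surrogate gradients, as is common in direct training) is omitted; the matrices are simply drawn at random.

```python
import numpy as np

def lif_step(v, current, threshold=1.0, decay=0.5):
    """One leaky integrate-and-fire step (assumed neuron model).

    Returns the updated membrane potential and a 0/1 spike vector."""
    v = decay * v + current
    spikes = (v >= threshold).astype(np.float64)
    v = v * (1.0 - spikes)  # hard reset for neurons that fired
    return v, spikes

def encode(obs, w_enc, timesteps):
    """Turn an observation into a (timesteps, n_neurons) spike train.

    w_enc has shape (timesteps, obs_dim, n_neurons): one matrix per
    timestep, so the input current can differ from step to step instead
    of following a fixed rate or population code."""
    n_neurons = w_enc.shape[2]
    v = np.zeros(n_neurons)
    train = np.zeros((timesteps, n_neurons))
    for t in range(timesteps):
        v, train[t] = lif_step(v, obs @ w_enc[t])
    return train

def decode(train, w_dec):
    """Map the whole spike train to real-valued outputs.

    w_dec has shape (timesteps * n_neurons, out_dim); the spike train is
    flattened so each (timestep, neuron) pair gets its own weight."""
    return train.reshape(-1) @ w_dec

# Hypothetical sizes; a small number of timesteps mimics the low-latency setting.
rng = np.random.default_rng(0)
obs_dim, n_neurons, out_dim, timesteps = 4, 8, 2, 2
w_enc = rng.normal(size=(timesteps, obs_dim, n_neurons))
w_dec = rng.normal(size=(timesteps * n_neurons, out_dim))

obs = rng.normal(size=obs_dim)
spike_train = encode(obs, w_enc, timesteps)
action_values = decode(spike_train, w_dec)
```

In this sketch the latency is governed by `timesteps`: because both matrices are parameters rather than a fixed coding rule, they could in principle be trained to carry enough information in very few steps, which is the intuition the abstract gives for the reduced latency.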