Spiking Neural Networks (SNNs) offer low-latency and energy-efficient decision-making on neuromorphic hardware by mimicking the event-driven dynamics of biological neurons. However, due to the discrete and non-differentiable nature of spikes, directly trained SNNs rely heavily on Batch Normalization (BN) to stabilize gradient updates. In online Reinforcement Learning (RL), imprecise BN statistics hinder exploitation, resulting in slower convergence and suboptimal policies. This challenge limits the adoption of SNNs for energy-efficient control on resource-constrained devices. To overcome this, we propose Confidence-adaptive and Re-calibration Batch Normalization (CaRe-BN), which introduces (\emph{i}) a confidence-guided adaptive update strategy for BN statistics and (\emph{ii}) a re-calibration mechanism to align distributions. By providing more accurate normalization, CaRe-BN stabilizes SNN optimization without disrupting the RL training process. Importantly, CaRe-BN does not alter inference, thus preserving the energy efficiency of SNNs in deployment. Extensive experiments on continuous control benchmarks demonstrate that CaRe-BN improves SNN performance by up to $22.6\%$ across different spiking neuron models and RL algorithms. Remarkably, SNNs equipped with CaRe-BN even surpass their ANN counterparts by $5.9\%$. These results highlight a new direction for BN techniques tailored to RL, paving the way for neuromorphic agents that are both efficient and high-performing.
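The abstract does not specify the exact update rule, but the general idea of a confidence-guided adaptive update of BN statistics can be sketched as a confidence-weighted exponential moving average, where low-confidence batches move the running statistics less. The function name, the scalar confidence score, and the momentum scaling below are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def confidence_adaptive_bn_update(running_mean, running_var,
                                  batch_mean, batch_var,
                                  confidence, base_momentum=0.1):
    """Hypothetical sketch of a confidence-guided BN statistics update.

    The effective momentum is scaled by a confidence score in [0, 1],
    so statistics from unreliable (low-confidence) batches contribute
    less to the running estimates. This is an illustration of the
    general mechanism, not the method proposed in the paper.
    """
    m = base_momentum * confidence          # adaptive effective momentum
    new_mean = (1 - m) * running_mean + m * batch_mean
    new_var = (1 - m) * running_var + m * batch_var
    return new_mean, new_var
```

With `confidence = 0` the running statistics are left untouched; with `confidence = 1` the update reduces to the standard BN momentum update. Since only training-time statistics are affected, inference is unchanged, consistent with the abstract's claim that deployment efficiency is preserved.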