Spiking Neural Networks (SNNs) offer low-latency and energy-efficient decision-making on neuromorphic hardware by mimicking the event-driven dynamics of biological neurons. However, the discrete and non-differentiable nature of spikes leads to unstable gradient propagation in directly trained SNNs, making Batch Normalization (BN) an important component for stabilizing training. In online Reinforcement Learning (RL), imprecise BN statistics hinder exploitation, resulting in slower convergence and suboptimal policies. While Artificial Neural Networks (ANNs) can often omit BN, SNNs critically depend on it, limiting the adoption of SNNs for energy-efficient control on resource-constrained devices. To overcome this, we propose Confidence-adaptive and Re-calibration Batch Normalization (CaRe-BN), which introduces (i) a confidence-guided adaptive update strategy for BN statistics and (ii) a re-calibration mechanism to align distributions. By providing more accurate normalization, CaRe-BN stabilizes SNN optimization without disrupting the RL training process. Importantly, CaRe-BN does not alter inference, thus preserving the energy efficiency of SNNs in deployment. Extensive experiments on both discrete and continuous control benchmarks demonstrate that CaRe-BN improves SNN performance by up to $22.6\%$ across different spiking neuron models and RL algorithms. Remarkably, SNNs equipped with CaRe-BN even surpass their ANN counterparts by $5.9\%$. These results highlight a new direction for BN techniques tailored to RL, paving the way for neuromorphic agents that are both efficient and high-performing. Code is available at https://github.com/xuzijie32/CaRe-BN.
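To make the two ingredients of CaRe-BN concrete, the following is a minimal illustrative sketch, not the paper's actual algorithm: it assumes a placeholder confidence heuristic (batch statistics that agree with the running statistics are trusted more, so the effective momentum adapts per update) and a simple re-calibration step that realigns the running statistics with recently observed batches. The class name `CaReBNSketch` and all internals are hypothetical.

```python
import numpy as np

class CaReBNSketch:
    """Hedged sketch of confidence-adaptive BN statistics with re-calibration.
    NOT the paper's exact method; the confidence function is a placeholder."""

    def __init__(self, num_features, base_momentum=0.1):
        self.running_mean = np.zeros(num_features)
        self.running_var = np.ones(num_features)
        self.base_momentum = base_momentum
        self.history = []  # recent batch statistics, kept for re-calibration

    def update(self, batch):
        mean = batch.mean(axis=0)
        var = batch.var(axis=0)
        # Placeholder confidence: a batch whose mean lies close to the running
        # mean (measured in running standard deviations) gets a weight near 1;
        # an outlying batch gets a small weight, damping the EMA update.
        z = np.abs(mean - self.running_mean) / np.sqrt(self.running_var + 1e-5)
        confidence = 1.0 / (1.0 + z.mean())           # scalar in (0, 1]
        m = self.base_momentum * confidence           # confidence-adaptive momentum
        self.running_mean = (1 - m) * self.running_mean + m * mean
        self.running_var = (1 - m) * self.running_var + m * var
        self.history.append((mean, var))

    def recalibrate(self, last_k=10):
        # Re-calibration: replace the (possibly stale) EMA statistics with
        # averages over the most recent batches, realigning the distributions.
        means, variances = zip(*self.history[-last_k:])
        self.running_mean = np.mean(means, axis=0)
        self.running_var = np.mean(variances, axis=0)

    def normalize(self, x, eps=1e-5):
        # Inference path is plain BN normalization, so deployment is unchanged.
        return (x - self.running_mean) / np.sqrt(self.running_var + eps)
```

Because `update` and `recalibrate` only touch the running statistics used at normalization time, a sketch like this leaves the inference computation (and hence the energy profile of the deployed SNN) untouched, matching the property the abstract highlights.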