Brain-inspired spiking neural networks (SNNs) have attracted significant research attention in algorithm design and perception applications. However, their potential for decision-making, particularly in model-based reinforcement learning, remains underexplored. The difficulty is twofold: spiking neurons need long-term temporal memory, and the network must be optimized to integrate and learn information well enough to make accurate predictions. The dynamic dendritic information integration of biological neurons offers valuable insight into both challenges. In this study, we propose a multi-compartment neuron model that nonlinearly integrates information from multiple dendritic sources to dynamically process long sequential inputs. Based on this model, we construct a Spiking World Model (Spiking-WM) to enable model-based deep reinforcement learning (DRL) with SNNs. We evaluate our model on the DeepMind Control Suite and show that Spiking-WM outperforms existing SNN-based models and achieves performance comparable to artificial neural network (ANN)-based world models that use Gated Recurrent Units (GRUs). Furthermore, we assess the long-term memory capabilities of the proposed model on speech datasets, including SHD, TIMIT, and LibriSpeech 100h, and show that our multi-compartment neuron model surpasses other SNN-based architectures in processing long sequences. Our findings underscore the critical role of dendritic information integration in shaping neuronal function and the importance of cooperative dendritic processing in enhancing neural computation.
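The abstract does not state the neuron's equations, so the following Python sketch is only a rough illustration of what "nonlinearly integrating information from multiple dendritic sources" could look like. It is not the Spiking-WM neuron. The function name, the two time constants, the sigmoid gating, and the hard reset are all assumptions chosen for simplicity: a fast and a slow dendritic compartment each leak-integrate the same input, the slow compartment gates the fast one, and the soma spikes when its potential crosses a threshold.

```python
import math


def simulate_two_dendrite_neuron(inputs, tau_fast=2.0, tau_slow=50.0,
                                 tau_soma=5.0, threshold=1.0):
    """Simulate a toy neuron with two dendritic compartments and a soma.

    Every parameter and update rule here is a hypothetical choice made
    for illustration; none of it is taken from the paper.
    Returns (spikes, fast_trace, slow_trace), one entry per input step.
    """
    # Per-step decay factors of the three leaky integrators.
    decay_fast = math.exp(-1.0 / tau_fast)
    decay_slow = math.exp(-1.0 / tau_slow)
    decay_soma = math.exp(-1.0 / tau_soma)

    d_fast = d_slow = v_soma = 0.0
    spikes, fast_trace, slow_trace = [], [], []

    for x in inputs:
        # Each dendrite leak-integrates the input with its own time constant.
        d_fast = decay_fast * d_fast + x
        d_slow = decay_slow * d_slow + x

        # Assumed nonlinear integration: the slow dendrite gates the fast
        # one through a sigmoid, so the somatic drive is a product of the
        # two compartments rather than a weighted sum.
        gate = 1.0 / (1.0 + math.exp(-d_slow))
        v_soma = decay_soma * v_soma + gate * d_fast

        # Threshold crossing emits a spike and hard-resets the soma.
        if v_soma >= threshold:
            spikes.append(1)
            v_soma = 0.0
        else:
            spikes.append(0)

        fast_trace.append(d_fast)
        slow_trace.append(d_slow)

    return spikes, fast_trace, slow_trace


if __name__ == "__main__":
    # A single input pulse followed by silence: the slow dendrite still
    # holds a trace of the pulse after the fast dendrite has decayed.
    spikes, fast_trace, slow_trace = simulate_two_dendrite_neuron(
        [1.0] + [0.0] * 20
    )
    print("fast dendrite at last step:", fast_trace[-1])
    print("slow dendrite at last step:", slow_trace[-1])
    print("spike train:", spikes)
```

The point of the two time constants is that the slow compartment retains information about inputs long after the fast one has forgotten them, which is one simple way a single neuron could carry longer temporal context. A trainable version would learn the decay factors and the gating rather than fixing them.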