This work studies how brain-inspired neural ensembles equipped with local Hebbian plasticity can perform active inference (AIF) to control dynamical agents. A generative model of the environment dynamics is learned by a network composed of two distinct Hebbian ensembles: a posterior network, which infers latent states from observations, and a state transition network, which predicts the next latent state from the current state-action pair. Experiments are conducted in the Mountain Car environment from the OpenAI Gym suite to study the effect of the Hebbian network parameters on task performance. The proposed Hebbian AIF approach is shown to outperform Q-learning while requiring no replay buffer, unlike typical reinforcement learning systems. These results motivate further investigation of Hebbian learning for the design of AIF networks that can learn environment dynamics without revisiting buffered past experiences.
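As a rough illustration of what "local" and "buffer-free" mean for a Hebbian update, the following minimal sketch trains a single linear unit online with Oja's rule. Oja's rule is used here only as a well-known example of a local Hebbian rule; it is an assumption and not necessarily the plasticity rule of the posterior or state transition ensembles. The names `oja_update` and `train`, the learning rate, and the synthetic two-dimensional inputs are all hypothetical.

```python
import random


def oja_update(w, x, lr=0.01):
    """One online step of Oja's rule for a single linear unit.

    The update uses only the current input x, the current output y,
    and the current weights w, so no past samples need to be stored.
    """
    y = sum(wi * xi for wi, xi in zip(w, x))  # post-synaptic activity
    # Hebbian growth (lr * y * x) minus a decay (lr * y^2 * w) that bounds |w|
    return [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]


def train(steps=5000, seed=0):
    """Stream synthetic inputs one at a time; each sample is used once."""
    rng = random.Random(seed)
    w = [0.1, 0.1]  # small non-zero initial weights (illustrative choice)
    for _ in range(steps):
        # Hypothetical input: high variance on axis 0, low variance on axis 1
        x = [rng.uniform(-2.0, 2.0), rng.uniform(-0.5, 0.5)]
        w = oja_update(w, x)
    return w


if __name__ == "__main__":
    print(train())
```

Each sample is consumed once and discarded, which is the sense in which such a learner does without a replay buffer. Under these assumptions the weight vector should align with the high-variance input axis while its norm stays close to one.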