Predictive map theory, one of the theories that explain spatial learning in animals, is based on successor representation (SR) learning algorithms. In the real world, agents such as animals and robots receive noisy observations, which can lead to suboptimal actions or even to failure during learning. In this study, we compared the performance of Successor Features (SFs) and Predecessor Features (PFs) algorithms in a noisy one-dimensional maze environment. Our results showed that PFs consistently outperformed SFs in terms of cumulative reward and average step length, and were more resilient to noise. This advantage may stem from the ability of PFs to propagate temporal difference errors back to a larger number of preceding states. We also discuss the biological mechanisms involved in PF learning for spatial navigation. This study contributes to theoretical research in computational neuroscience using reinforcement learning algorithms and highlights the practical potential of PFs in robotics, game AI, and autonomous vehicle navigation.
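The claim that PFs pass temporal difference errors to more preceding states can be illustrated with a small sketch. The abstract does not give the update rules, so everything below is an assumption made for illustration: a tabular setting with one-hot state features, a noise-free deterministic walk along a one-dimensional chain, and a hypothetical `lam` parameter that controls an eligibility trace. With `lam=0.0` only the current state is updated (an SF-like one-step update); with `lam > 0` earlier states also receive a decayed share of each error (a PF-like update). The function names and parameter values are likewise hypothetical and are not taken from the study.

```python
def run_episode(M, gamma, alpha, lam):
    """One left-to-right pass through a 1D chain, updating M in place."""
    n = len(M)
    trace = [0.0] * n  # eligibility trace over states
    for s in range(n - 1):
        s_next = s + 1
        # Decay the old trace, then mark the current state (one-hot feature).
        trace = [gamma * lam * e for e in trace]
        trace[s] += 1.0
        # Vector-valued TD error: phi(s) + gamma * M[s_next] - M[s].
        delta = [(1.0 if j == s else 0.0) + gamma * M[s_next][j] - M[s][j]
                 for j in range(n)]
        # Every state with a nonzero trace receives a share of the error.
        for i in range(n):
            if trace[i] != 0.0:
                for j in range(n):
                    M[i][j] += alpha * trace[i] * delta[j]
    return M


def learn(n_states, lam, episodes=1, gamma=0.95, alpha=0.1):
    """Learn a tabular predictive map, starting from an all-zero matrix."""
    M = [[0.0] * n_states for _ in range(n_states)]
    for _ in range(episodes):
        run_episode(M, gamma, alpha, lam)
    return M


sf_like = learn(n_states=6, lam=0.0)  # one-step update only
pf_like = learn(n_states=6, lam=0.9)  # errors also reach earlier states

# How strongly the start state (0) predicts a late state (4) after one episode.
print("SF-like:", sf_like[0][4])
print("PF-like:", pf_like[0][4])
```

In this toy setting, after a single episode the SF-like entry for the start state predicting state 4 is still zero, because the one-step update has only touched each state's own row. The PF-like entry is already positive, because the trace carried the error from state 4 back to the start state. Whether this mechanism explains the noise resilience reported in the study is the authors' hypothesis; the sketch only shows the difference in how errors are propagated.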