Navigation is a complex skill with a long history of research in animals and humans. In this work, we simulate the Morris Water Maze in 2D to train deep reinforcement learning agents. We automatically classify navigation strategies, analyze the distribution of strategies used by artificial agents, and compare them with experimental data, showing learning dynamics similar to those seen in humans and rodents. We develop environment-specific auxiliary tasks and examine the factors that affect their usefulness. We suggest that the most beneficial tasks may also be more biologically feasible for real agents to use. Lastly, we explore the development of internal representations in the activations of the agents' neural networks. These representations resemble place cells and head-direction cells found in mouse brains, and their presence correlates with the navigation strategies that the artificial agents employ.