Navigation is a complex skill with a long history of research in animals and humans. In this work, we simulate the Morris Water Maze in 2D to train deep reinforcement learning agents. We automatically classify navigation strategies, analyze the distribution of strategies used by artificial agents, and compare them with experimental data, showing learning dynamics similar to those seen in humans and rodents. We develop environment-specific auxiliary tasks and examine the factors affecting their usefulness. We suggest that the most beneficial tasks are potentially more biologically feasible for real agents to use. Lastly, we explore the development of internal representations in the activations of the agents' neural networks. These representations resemble place cells and head-direction cells found in mouse brains, and their presence correlates with the navigation strategies that artificial agents employ.