Schemas are knowledge structures that can enable one-shot learning. Rodent one-shot learning in a multiple paired association navigation task has been postulated to be schema-dependent. However, the correspondence between schemas and neural implementations remains poorly understood, and a biologically plausible computational model of the rodents' learning has not been demonstrated. Here, we compose such an agent from schemas with biologically plausible neural implementations. The agent contains an associative memory that can form one-shot associations between sensory cues and goal coordinates, implemented using a network with either a feedforward layer or a reservoir of recurrently connected neurons whose plastic output weights are governed by a novel four-factor reward-modulated Exploratory Hebbian (EH) rule. Adding an actor-critic allows the agent to succeed even if obstacles prevent it from heading directly to the goal. With the addition of working memory, the rodent behavior is replicated. Temporal-difference learning of a working memory gate enables one-shot learning despite distractors.
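To make the idea of a one-shot association between a sensory cue and goal coordinates concrete, the following is a minimal sketch and not the EH rule itself, whose exact form is not given here. It swaps in a plain reward-gated, error-corrected Hebbian (delta-style) update on a single feedforward layer. The one-hot cue encoding, the learning rate of 1, the helper name `store_association`, and the example coordinates are all assumptions made for illustration.

```python
import numpy as np

def store_association(W, cue, goal_xy, reward):
    """Hypothetical helper: reward-gated, error-corrected Hebbian update.

    With a unit-norm cue and an implicit learning rate of 1, a single
    rewarded update makes W @ cue equal goal_xy (a one-shot association).
    When reward is 0, the weights are left unchanged.
    """
    error = goal_xy - W @ cue                 # mismatch between recalled and actual goal
    return W + reward * np.outer(error, cue)  # outer product of error and presynaptic cue

n_cues = 4
cues = np.eye(n_cues)                # one-hot sensory cues (assumption)
W = np.zeros((2, n_cues))            # output weights: cue -> (x, y) goal coordinates
goals = np.array([[0.5, -0.3],
                  [-0.7, 0.2]])      # illustrative goal coordinates

# One rewarded exposure per cue-goal pair.
W = store_association(W, cues[0], goals[0], reward=1.0)
W = store_association(W, cues[1], goals[1], reward=1.0)

# An unrewarded exposure leaves the memory untouched.
W_unrewarded = store_association(W, cues[2], np.array([0.9, 0.9]), reward=0.0)

# Re-pairing cue 0 with a new goal overwrites only that association.
new_goal = np.array([0.1, 0.8])
W_remapped = store_association(W, cues[0], new_goal, reward=1.0)

recalled_0 = W @ cues[0]
recalled_1 = W @ cues[1]
```

Because the cues are orthonormal in this sketch, each association occupies its own column of `W`, so storing or re-pairing one cue does not disturb the others. With overlapping cue representations, such as reservoir activity, a single update of this kind would generally interfere with earlier associations.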