Drawing parallels between Deep Artificial Neural Networks (DNNs) and biological systems can aid in understanding complex biological mechanisms that are difficult to disentangle. Temporal processing is one such example: although extensively researched, its underlying mechanisms still lack a coherent account. In this study, we investigate temporal processing in a Deep Reinforcement Learning (DRL) agent performing an interval timing task and explore potential biological counterparts to its emergent behavior. The agent was successfully trained to perform a duration production task, which involved marking successive occurrences of a target interval while viewing a video sequence. Analysis of the agent's internal states revealed oscillatory neural activations, a ubiquitous pattern in biological systems. Interestingly, the agent's actions were predominantly influenced by neurons whose oscillations had high amplitudes and frequencies corresponding to the target interval. We draw parallels between the agent's time-keeping strategy and the Striatal Beat Frequency (SBF) model, a biologically plausible model of interval timing. Furthermore, the agent maintained its oscillatory representations and task performance when tested on different video sequences (including a blank video). Thus, once the task was learned, the agent had internalized its time-keeping mechanism and showed minimal reliance on its environment to perform the timing task. We also discuss a hypothesis about the resemblance between this emergent behavior and certain aspects of the evolution of biological processes such as circadian rhythms. This study aims to contribute to recent efforts to use DNNs to understand biological systems, with a particular emphasis on temporal processing.
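As a rough illustration of what it means for a unit's oscillation frequency to "correspond to the target interval", the sketch below estimates the dominant period of a single activation trace with a discrete Fourier transform. It is not the analysis used in the study: the trace is synthetic, and the function name `dominant_period`, the 20-step target interval, and the noise level are all assumptions made for the example.

```python
import numpy as np

def dominant_period(activation, dt=1.0):
    """Return (period, peak_magnitude) of the strongest non-DC frequency.

    `activation` is a 1-D trace of one unit's activity sampled every `dt`
    time units. This is a hypothetical helper, not the study's method.
    """
    x = np.asarray(activation, dtype=float)
    x = x - x.mean()                        # remove the constant offset
    spectrum = np.abs(np.fft.rfft(x))       # magnitude spectrum
    freqs = np.fft.rfftfreq(x.size, d=dt)   # frequency of each bin
    k = 1 + np.argmax(spectrum[1:])         # skip the zero-frequency bin
    return 1.0 / freqs[k], spectrum[k]

# Synthetic stand-in for a recorded unit: a sinusoid whose period equals an
# assumed target interval of 20 time steps, plus a little noise.
target_interval = 20
steps = np.arange(400)
rng = np.random.default_rng(0)
unit = np.sin(2 * np.pi * steps / target_interval)
unit = unit + 0.1 * rng.standard_normal(steps.size)

period, peak_magnitude = dominant_period(unit)
print(f"estimated period: {period:.1f} steps")
```

Applied to each recorded unit, a measure like this would let one compare estimated periods against the target interval and rank units by peak magnitude, under the assumption that the traces are long enough to resolve the interval.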