Replay is a powerful strategy for promoting learning in both artificial intelligence and the brain. However, the conditions under which replay arises and its functional advantages are not yet fully understood. In this study, we develop a modular reinforcement learning model capable of generating replay. We demonstrate that replay generated in this way facilitates task completion. We also analyze the information contained in the learned representations and propose a mechanism for how replay makes a difference. Our design avoids complex assumptions, allowing replay to emerge naturally within a task-optimized paradigm. Our model also reproduces key phenomena observed in biological agents. This research explores the structural biases in modular ANNs that give rise to replay, and their potential utility in developing efficient RL algorithms.