Deploying pretrained policies in real-world applications presents substantial challenges that fundamentally limit the practical applicability of learning-based control systems. When autonomous systems encounter changes in system dynamics, sensor drift, or task objectives, fixed policies degrade rapidly in performance. We show that Real-Time Recurrent Reinforcement Learning (RTRRL), a biologically plausible algorithm for online adaptation, can effectively fine-tune a pretrained policy and improve an autonomous agent's performance on driving tasks. We further show that RTRRL synergizes with a recent biologically inspired recurrent network model, the Liquid-Resistance Liquid-Capacitance RNN. We demonstrate the effectiveness of this closed-loop approach in the simulated CarRacing environment and in a real-world line-following task with a RoboRacer car equipped with an event camera.