Deploying pretrained policies in real-world applications presents substantial challenges that fundamentally limit the practical applicability of learning-based control systems. When autonomous systems encounter changes in system dynamics, sensor drift, or task objectives, fixed policies rapidly degrade in performance. We show that Real-Time Recurrent Reinforcement Learning (RTRRL), a biologically plausible algorithm for online adaptation, can effectively fine-tune a pretrained policy to improve autonomous agents' performance on driving tasks. We further show that RTRRL synergizes with a recent biologically inspired recurrent network model, the Liquid-Resistance Liquid-Capacitance RNN. We demonstrate the effectiveness of this closed-loop approach in a simulated CarRacing environment and in a real-world line-following task with a RoboRacer car equipped with an event camera.