Model-based reinforcement learning (MBRL) techniques have recently yielded promising results for real-world autonomous racing using high-dimensional observations. MBRL agents, such as Dreamer, solve long-horizon tasks by building a world model and planning actions through latent imagination. This approach involves explicitly learning a model of the system dynamics and using it to learn the optimal policy for continuous control over multiple timesteps. Because the policy is learned from the model rather than from the real system, MBRL agents may converge to sub-optimal policies if the world model is inaccurate. To improve state estimation for autonomous racing, this paper proposes a self-supervised sensor fusion technique that combines egocentric LiDAR and RGB camera observations collected from the F1TENTH Gym. The zero-shot performance of MBRL agents is empirically evaluated on unseen tracks and against a dynamic obstacle. This paper shows that multimodal perception improves the robustness of the world model without requiring additional training data. The resulting multimodal Dreamer agent safely avoided collisions and won more races than any other tested baseline in zero-shot head-to-head autonomous racing.
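To make the idea of latent imagination concrete, the following is a minimal toy sketch and not the architecture used in this paper. It assumes that the two modalities are fused by concatenating per-modality linear embeddings, and it replaces the learned world model with hand-written stand-ins. All names (`encode`, `toy_dynamics`, `toy_reward`, `imagine`, `steer_policy`, `idle_policy`) and all numeric values are hypothetical.

```python
def encode(lidar, image, w_lidar, w_image):
    # Assumed fusion scheme: one linear encoder per modality,
    # then concatenate the two embeddings into a single latent vector.
    z_lidar = [sum(w * x for w, x in zip(row, lidar)) for row in w_lidar]
    z_image = [sum(w * x for w, x in zip(row, image)) for row in w_image]
    return z_lidar + z_image


def toy_dynamics(z, action):
    # Hand-written stand-in for a learned latent transition model.
    return [0.9 * zi + 0.1 * action for zi in z]


def toy_reward(z):
    # Hand-written stand-in for a learned reward head: prefer latents near zero.
    return -sum(zi * zi for zi in z)


def imagine(z0, policy, horizon, gamma=0.99):
    # Roll the policy forward inside the model only; no environment steps.
    z, ret, discount = list(z0), 0.0, 1.0
    for _ in range(horizon):
        z = toy_dynamics(z, policy(z))
        ret += discount * toy_reward(z)
        discount *= gamma
    return ret


def steer_policy(z):
    # Push the mean of the latent toward zero.
    return -sum(z) / len(z)


def idle_policy(z):
    # Baseline that never acts.
    return 0.0


# Toy observations and encoder weights, chosen only for illustration.
z0 = encode([1.0, 2.0], [0.5, 0.5, 0.5], [[0.5, 0.5]], [[1.0, 1.0, 1.0]])
print(imagine(z0, steer_policy, horizon=15), imagine(z0, idle_policy, horizon=15))
```

Comparing imagined returns in this way ranks candidate policies without any real rollouts. If `toy_dynamics` were a poor approximation of the true system, the ranking could be wrong, which is the failure mode that motivates improving the world model's state estimate.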