Current reinforcement learning (RL) approaches to autonomous controllers are data-hungry, yet they underperform, are unstable, fail to grasp and anchor on the concept of safety, and over-concentrate on noisy features because of their reliance on pixel reconstruction. Self-supervised learning approaches that learn on high-dimensional representations using the Joint Embedding Predictive Architecture (JEPA) are an interesting and effective alternative, since the idea mimics the human brain's natural ability to acquire new skills through imagination and a minimal number of observations. This study introduces Hanoi-World, a JEPA-based world model that uses a recurrent neural network (RNN) for long-horizon planning with efficient inference time. Experiments conducted on the Highway-Env package across different environments demonstrate an effective capability to produce safety-aware driving plans, with a competitive collision rate in comparison with SOTA baselines.