This paper presents a self-supervised learning method for safely learning a motion planner that lets ground robots navigate environments with dense, dynamic obstacles. When facing highly cluttered, fast-moving, hard-to-predict obstacles, classical motion planners may not be able to keep up given limited onboard computation. For learning-based planners, high-quality demonstrations for imitation learning are difficult to acquire, while reinforcement learning becomes inefficient due to the high probability of collision during exploration. To provide training data safely and efficiently, Learning from Hallucination (LfH) approaches synthesize difficult navigation environments from past successful navigation experiences in relatively easy or completely open ones, but they cannot address dynamic obstacles. In our new Dynamic Learning from Learned Hallucination (Dyna-LfLH), we design and learn a novel latent distribution and sample dynamic obstacles from it, so that the generated training data can be used to learn a motion planner for navigating dynamic environments. Dyna-LfLH is evaluated on a ground robot in both simulated and physical environments and achieves up to a 25% better success rate than the baselines.