There are two major challenges in scaling up robot navigation around dynamic obstacles: the complex interaction dynamics of the obstacles can be hard to model analytically, and the complexity of planning and control grows exponentially with the number of obstacles. Data-driven and learning-based methods are thus particularly valuable in this context. However, data-driven methods are sensitive to distribution drift, which makes it hard to train learned models that generalize across different obstacle densities. We propose a novel method for compositional learning of Sequential Neural Control Barrier models (SNCBFs) to achieve scalability. Our approach exploits an important observation: the spatial interaction patterns of multiple dynamic obstacles can be decomposed and predicted through temporal sequences of states for each obstacle. Through this decomposition, we can generalize control policies trained with only a small number of obstacles to environments where the obstacle density can be 100x higher. We demonstrate the benefits of the proposed method in improving dynamic collision avoidance in comparison with existing methods, including potential fields, end-to-end reinforcement learning, and model-predictive control. We also perform hardware experiments and show the practical effectiveness of the approach in the supplementary video.
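To make the decomposition idea concrete, the following is a minimal sketch rather than the method of this work: it assumes that each obstacle is scored independently from its own short state history and that the per-obstacle scores are combined with a minimum, so the same code applies to any number of obstacles. The function names (`per_obstacle_barrier`, `composed_barrier`, `pick_safe_action`), the constant-velocity prediction that stands in for a learned model, and the parameters `safe_radius` and `dt` are all hypothetical.

```python
import math


def per_obstacle_barrier(robot_xy, obstacle_history, safe_radius=0.5):
    """Hypothetical stand-in for a learned per-obstacle barrier.

    obstacle_history is a list of (x, y) positions, oldest first, with at
    least two entries. The value is positive when the one-step
    constant-velocity prediction of the obstacle lies outside safe_radius
    of the robot position.
    """
    (x0, y0), (x1, y1) = obstacle_history[-2], obstacle_history[-1]
    predicted = (2 * x1 - x0, 2 * y1 - y0)  # extrapolate one step ahead
    return math.dist(robot_xy, predicted) - safe_radius


def composed_barrier(robot_xy, histories, safe_radius=0.5):
    """Combine per-obstacle values with a minimum (an assumed composition rule)."""
    if not histories:
        return math.inf  # no obstacles: nothing constrains the robot
    return min(per_obstacle_barrier(robot_xy, h, safe_radius) for h in histories)


def pick_safe_action(robot_xy, histories, candidate_actions, dt=0.1):
    """Pick the candidate velocity whose next position has the largest composed value."""
    def score(action):
        next_xy = (robot_xy[0] + action[0] * dt, robot_xy[1] + action[1] * dt)
        return composed_barrier(next_xy, histories)

    return max(candidate_actions, key=score)
```

Because each obstacle is evaluated on its own, adding obstacles only lengthens the list passed to `composed_barrier`; the per-obstacle function itself never sees an input of a different shape, which is the property the decomposition argument relies on.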