Conditional diffusion models have proven to be an efficient tool for learning robot policies, owing to their ability to accurately model the conditional distribution of policies. Real-world scenarios, with their dynamic obstacles and maze-like structures, make robot local navigation decision-making a complex conditional distribution problem. Nevertheless, applying diffusion models to robot local navigation is not trivial and faces several under-explored challenges: (1) Data Urgency. The complex conditional distribution in local navigation requires training data that covers diverse policies across diverse real-world scenarios; (2) Myopic Observation. Because perception scenarios are so diverse, diffusion decisions based only on the robot's local perspective often lack foresight and may be suboptimal for completing the overall task. In scenarios that require detours, the robot may become trapped. To address these issues, we first explore a diverse data generation mechanism in which multiple agents exhibit distinct preferences through target selection informed by integrated global-local insights. Training on this diverse data yields a diffusion agent capable of strong collision avoidance across diverse scenarios. We then augment our Local Diffusion Planner (LDP) by incorporating global observations in a lightweight manner. This broadens LDP's observational scope, mitigating the risk of becoming trapped in local optima and promoting more robust navigation decisions.
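To make the idea of a conditional diffusion policy concrete, the following is a minimal sketch of DDPM-style reverse sampling of an action conditioned on an observation. It is not the authors' implementation. The function names, the linear noise schedule, and the tuple `cond = (local_hint, global_hint, w)` are hypothetical, and `toy_noise_predictor` is an analytic stand-in for a trained network: it assumes the clean action is a fixed blend of a local and a global hint, which is only one possible reading of "incorporating global observations in a lightweight manner".

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumption); returns betas, alphas, cumulative products."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, prod = [], 1.0
    for a in alphas:
        prod *= a
        alpha_bars.append(prod)
    return betas, alphas, alpha_bars


def toy_noise_predictor(x_t, alpha_bar_t, cond):
    """Hypothetical stand-in for a trained eps_theta(x_t, t, cond).

    Assumes the clean action is a fixed blend of the local and global hints,
    so the noise contained in x_t can be computed exactly.
    """
    local_hint, global_hint, w = cond
    target = [(1.0 - w) * l + w * g for l, g in zip(local_hint, global_hint)]
    scale = math.sqrt(1.0 - alpha_bar_t)
    return [(x - math.sqrt(alpha_bar_t) * a) / scale for x, a in zip(x_t, target)]


def sample_action(cond, num_steps=50, seed=0):
    """DDPM ancestral sampling: start from Gaussian noise, denoise step by step."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in cond[0]]
    for t in reversed(range(num_steps)):
        eps = toy_noise_predictor(x, alpha_bars[t], cond)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t]) for xi, ei in zip(x, eps)]
        if t > 0:
            # Add fresh noise on every step except the last one.
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x


if __name__ == "__main__":
    # Local hint points along x, global hint along y; w weights the global hint.
    print(sample_action(([1.0, 0.0], [0.0, 1.0], 0.25)))
```

Because the stand-in predictor is exact for a single target action, the sampler collapses onto that blended action. A learned predictor would instead represent a multimodal distribution over actions, which is the property the abstract relies on.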