GP-guided MPPI for Efficient Navigation in Complex Unknown Cluttered Environments

From arXiv. This paper has 8 pages, 6 figures, and 2 tables. It has been accepted for publication at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, Michigan, USA, 2023.

Navigating unknown, cluttered environments with limited sensing capabilities poses significant challenges in robotics. Local trajectory optimization methods, such as Model Predictive Path Integral (MPPI), are a promising solution to this challenge. However, global guidance is required to ensure effective navigation, especially when encountering challenging environmental conditions or navigating beyond the planning horizon. This study presents GP-MPPI, an online learning-based control strategy that integrates MPPI with a local perception model based on a Sparse Gaussian Process (SGP). The key idea is to leverage the learning capability of the SGP to construct a variance (uncertainty) surface, which enables the robot to learn about the navigable space surrounding it, identify a set of suggested subgoals, and ultimately recommend to the local MPPI planner the optimal subgoal, that is, the one that minimizes a predefined cost function. MPPI then computes the optimal control sequence that satisfies both the robot's constraints and the collision-avoidance constraints. This approach eliminates the need for a global map of the environment or an offline training process. We validate the efficiency and robustness of the proposed control strategy through both simulated and real-world experiments on 2D autonomous navigation tasks in complex unknown environments, demonstrating its superiority in guiding the robot safely towards its desired goal while avoiding obstacles and escaping entrapment in local minima. The GPU implementation of GP-MPPI, including the supplementary video, is available at https://github.com/IhabMohamed/GP-MPPI.
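To make the MPPI half of this pipeline more concrete, the following is a minimal sketch of one sampling-based MPPI update that steers toward a given subgoal. It is not the authors' GPU implementation. It assumes a 2D point robot with velocity commands, circular obstacles, and a simple distance-plus-collision cost; the subgoal is taken as an input, standing in for whatever the SGP layer would recommend. The names `rollout_cost` and `mppi_step` and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def rollout_cost(state, controls, subgoal, obstacles, dt=0.1):
    """Accumulated cost of a control sequence for a 2D point robot (assumed model)."""
    x, y = state
    cost = 0.0
    for vx, vy in controls:
        # Assumed dynamics: simple Euler integration of velocity commands.
        x, y = x + vx * dt, y + vy * dt
        # Stage cost: Euclidean distance to the recommended subgoal.
        cost += math.hypot(subgoal[0] - x, subgoal[1] - y)
        # Collision penalty for entering any circular obstacle (ox, oy, radius).
        for ox, oy, radius in obstacles:
            if math.hypot(ox - x, oy - y) < radius:
                cost += 1000.0
    return cost


def mppi_step(state, nominal, subgoal, obstacles,
              n_samples=500, sigma=0.5, lam=1.0, seed=0):
    """One MPPI update: perturb the nominal controls, then reweight by cost."""
    rng = random.Random(seed)
    horizon = len(nominal)
    noises, costs = [], []
    for _ in range(n_samples):
        # Sample Gaussian noise for every time step and control dimension.
        eps = [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
               for _ in range(horizon)]
        perturbed = [(u[0] + e[0], u[1] + e[1]) for u, e in zip(nominal, eps)]
        noises.append(eps)
        costs.append(rollout_cost(state, perturbed, subgoal, obstacles))
    # Subtract the minimum cost before exponentiating for numerical stability.
    best = min(costs)
    weights = [math.exp(-(c - best) / lam) for c in costs]
    total = sum(weights)
    updated = []
    for t in range(horizon):
        # Cost-weighted average of the sampled perturbations at time step t.
        dvx = sum(w * eps[t][0] for w, eps in zip(weights, noises)) / total
        dvy = sum(w * eps[t][1] for w, eps in zip(weights, noises)) / total
        updated.append((nominal[t][0] + dvx, nominal[t][1] + dvy))
    return updated
```

In a receding-horizon loop, one would apply the first control of the returned sequence, shift the sequence by one step, and call `mppi_step` again with the new state and the latest subgoal. The temperature `lam` controls how sharply low-cost rollouts dominate the average.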
