This paper presents Stein Variational Guided MPPI (SVG-MPPI), a novel Stochastic Optimal Control (SOC) method based on Model Predictive Path Integral control (MPPI) and designed to handle rapidly shifting multimodal optimal action distributions. MPPI finds a Gaussian approximation of the optimal action distribution in closed form, i.e., without iterative solution updates, but it struggles when the optimal distribution is multimodal, as happens under non-convex constraints such as those for obstacle avoidance. The reason is that a single Gaussian cannot represent more than one mode. To overcome this limitation, our method identifies a target mode of the optimal distribution and guides the solution to converge to it. The target mode is roughly estimated using a modified Stein Variational Gradient Descent (SVGD) method and then embedded into the MPPI algorithm to obtain a closed-form "mode-seeking" solution that covers only the target mode, thus preserving the fast convergence property of MPPI. Our simulation and real-world experimental results demonstrate that SVG-MPPI outperforms both the original MPPI and other state-of-the-art sampling-based SOC algorithms in path-tracking and obstacle-avoidance capabilities. Source code: https://github.com/kohonda/proj-svg_mppi
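The multimodality problem can be illustrated with a toy example. The sketch below is not the SVG-MPPI implementation from the linked repository. It assumes a hypothetical one-dimensional action with a bimodal cost, a simplified MPPI-style update (a softmin-weighted mean of Gaussian samples), and plain gradient descent on a single guide particle as a stand-in for the modified SVGD step. The names `cost`, `mppi_update`, and `guide_to_mode`, and all parameter values, are assumptions made for illustration.

```python
import math
import random


def cost(u):
    # Hypothetical bimodal cost: two equally good actions at u = -1 and u = +1,
    # separated by a high-cost region around u = 0 (e.g. an obstacle ahead).
    return min((u - 1.0) ** 2, (u + 1.0) ** 2)


def mppi_update(mean, sigma, n_samples, lam, rng):
    """One simplified MPPI-style update: softmin-weighted mean of Gaussian samples."""
    samples = [rng.gauss(mean, sigma) for _ in range(n_samples)]
    costs = [cost(u) for u in samples]
    c_min = min(costs)  # subtract the minimum cost for numerical stability
    weights = [math.exp(-(c - c_min) / lam) for c in costs]
    total = sum(weights)
    return sum(w * u for w, u in zip(weights, samples)) / total


def guide_to_mode(u, step=0.1, iters=50, eps=1e-4):
    """Finite-difference gradient descent on one guide particle.

    This is a stand-in for the mode estimation step, not SVGD itself.
    """
    for _ in range(iters):
        grad = (cost(u + eps) - cost(u - eps)) / (2.0 * eps)
        u -= step * grad
    return u


rng = random.Random(0)

# Wide Gaussian centred between the modes: the weighted mean averages both modes.
vanilla = mppi_update(mean=0.0, sigma=1.0, n_samples=20000, lam=0.1, rng=rng)

# Estimate one target mode first, then sample narrowly around it.
target = guide_to_mode(0.3)
guided = mppi_update(mean=target, sigma=0.2, n_samples=20000, lam=0.1, rng=rng)

print(f"vanilla mean: {vanilla:.3f}, cost: {cost(vanilla):.3f}")
print(f"guided mean:  {guided:.3f}, cost: {cost(guided):.3f}")
```

In this toy setting the unguided update lands between the two modes, where the cost is high, while the guided update stays close to the selected mode. The narrow sampling width used in the guided call is a hand-picked assumption.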