This paper presents Stein Variational Guided MPPI (SVG-MPPI), a novel Stochastic Optimal Control (SOC) method based on Model Predictive Path Integral control (MPPI) and designed to handle rapidly shifting multimodal optimal action distributions. MPPI can find a Gaussian approximation of the optimal action distribution in closed form, i.e., without iterative solution updates, but it struggles when the optimal distribution is multimodal, because a single Gaussian cannot represent several modes. To overcome this limitation, our method identifies a target mode of the optimal distribution and guides the solution to converge to it. The target mode is roughly estimated using a modified Stein Variational Gradient Descent (SVGD) method, and this estimate is embedded into the MPPI algorithm to find a closed-form "mode-seeking" solution that covers only the target mode, thus preserving the fast convergence property of MPPI. Our simulation and real-world experimental results demonstrate that SVG-MPPI outperforms both the original MPPI and other state-of-the-art sampling-based SOC algorithms in terms of path-tracking and obstacle-avoidance capabilities. Source code: https://github.com/kohonda/proj-svg_mppi
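To make the multimodality issue concrete, the following is a minimal one-dimensional sketch, not the authors' implementation. It assumes a toy cost with two equally good actions at u = -1 and u = +1, and it uses plain gradient descent as a hypothetical stand-in for the modified SVGD step that the paper uses to estimate the target mode. The function names (`cost`, `mppi_update`, `rough_mode`) and all parameter values are illustrative assumptions.

```python
import math
import random


def cost(u):
    """Toy bimodal cost (assumption): two equally good actions at u = -1 and u = +1."""
    return (u * u - 1.0) ** 2


def mppi_update(mean, std, n_samples, lam, rng):
    """One closed-form MPPI-style update: the softmin-weighted mean of Gaussian samples."""
    samples = [rng.gauss(mean, std) for _ in range(n_samples)]
    costs = [cost(u) for u in samples]
    c_min = min(costs)  # subtract the minimum cost for numerical stability
    weights = [math.exp(-(c - c_min) / lam) for c in costs]
    total = sum(weights)
    return sum(w * u for w, u in zip(weights, samples)) / total


def rough_mode(start, step, n_steps):
    """Hypothetical stand-in for the mode estimate: plain gradient descent, not SVGD."""
    u = start
    for _ in range(n_steps):
        grad = 4.0 * u * (u * u - 1.0)  # derivative of (u^2 - 1)^2
        u -= step * grad
    return u


rng = random.Random(0)

# Plain update: a wide Gaussian centred between the modes averages over both.
plain = mppi_update(mean=0.0, std=1.0, n_samples=5000, lam=0.1, rng=rng)

# Guided update: estimate one mode roughly, then sample narrowly around it.
target = rough_mode(start=0.3, step=0.05, n_steps=50)
guided = mppi_update(mean=target, std=0.2, n_samples=5000, lam=0.1, rng=rng)

print(f"plain  update: u = {plain:+.3f}, cost = {cost(plain):.3f}")
print(f"guided update: u = {guided:+.3f}, cost = {cost(guided):.3f}")
```

In this toy setting the plain update lands between the two modes, where the cost is high, while the guided update stays close to the mode it was pointed at. Both updates remain a single weighted average, which is the closed-form property the abstract refers to.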