Sampling-based controllers, such as Model Predictive Path Integral (MPPI) methods, offer substantial flexibility but often suffer from high variance and low sample efficiency. To address these challenges, we introduce a hybrid variance-reduced MPPI framework that integrates a prior model into the sampling process. Our key insight is to decompose the objective function into a known approximate model and a residual term. Since the residual captures only the discrepancy between the model and the objective, it typically exhibits a smaller magnitude and lower variance than the original objective. Although this principle applies to general modeling choices, we demonstrate that adopting a quadratic approximation enables the derivation of a closed-form, model-guided prior that effectively concentrates samples in informative regions. Crucially, the framework is agnostic to the source of geometric information, allowing the quadratic model to be constructed from exact derivatives, structural approximations (e.g., Gauss-Newton or quasi-Newton), or gradient-free randomized smoothing. We validate the approach on standard optimization benchmarks, a nonlinear, underactuated cart-pole control task, and a contact-rich manipulation problem with non-smooth dynamics. Across these domains, we achieve faster convergence and better performance than standard MPPI in low-sample regimes. These results suggest that the method can make sampling-based control strategies more practical in scenarios where obtaining samples is expensive or limited.
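To make the decomposition concrete, the following is a minimal one-dimensional sketch and not the authors' implementation. It assumes a Gaussian sampling distribution N(mu, sigma2), a temperature `lam`, and a quadratic model m(u) = 0.5 * h * (u - c)^2; all function and parameter names are hypothetical. Under these assumptions, the product of the Gaussian and exp(-m(u) / lam) is again Gaussian, so samples can be drawn from that model-guided prior and weighted by the residual J(u) - m(u) alone, rather than by the full cost J(u).

```python
import math
import random


def guided_prior(mu, sigma2, h, c, lam):
    """Gaussian proportional to N(u; mu, sigma2) * exp(-0.5 * h * (u - c)**2 / lam).

    Multiplying a Gaussian by an exponentiated quadratic adds precisions:
    precision = 1 / sigma2 + h / lam. Requires the result to be positive.
    """
    precision = 1.0 / sigma2 + h / lam
    var = 1.0 / precision
    mean = var * (mu / sigma2 + h * c / lam)
    return mean, var


def softmin_mean(samples, costs, lam):
    """Weighted mean with weights exp(-cost / lam), shifted by the minimum cost
    for numerical stability (the shift cancels in the normalization)."""
    c_min = min(costs)
    weights = [math.exp(-(cost - c_min) / lam) for cost in costs]
    total = sum(weights)
    return sum(w * u for w, u in zip(weights, samples)) / total


def standard_update(cost, mu, sigma2, lam, n, rng):
    """Baseline: sample from N(mu, sigma2), weight by the full cost."""
    samples = [rng.gauss(mu, math.sqrt(sigma2)) for _ in range(n)]
    return softmin_mean(samples, [cost(u) for u in samples], lam)


def guided_update(cost, mu, sigma2, h, c, lam, n, rng):
    """Hybrid: sample from the model-guided prior, weight by the residual only."""
    mean, var = guided_prior(mu, sigma2, h, c, lam)
    samples = [rng.gauss(mean, math.sqrt(var)) for _ in range(n)]
    # Residual = true cost minus the quadratic model (assumed form, see lead-in).
    residuals = [cost(u) - 0.5 * h * (u - c) ** 2 for u in samples]
    return softmin_mean(samples, residuals, lam)


def example_cost(u):
    """Hypothetical test cost: a quadratic bowl plus a small non-quadratic ripple."""
    return 0.5 * (u - 2.0) ** 2 + 0.1 * math.sin(3.0 * u)
```

Calling `standard_update(example_cost, 0.0, 1.0, 1.0, 32, random.Random(0))` and `guided_update(example_cost, 0.0, 1.0, 1.0, 2.0, 1.0, 32, random.Random(0))` estimates the same weighted mean in two ways. In the guided version the weights depend only on the small ripple term, so they are closer to uniform, which is the variance-reduction effect the abstract describes. When the model matches the cost exactly, the residual vanishes and the estimate reduces to the plain sample mean of the guided prior.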