We present a novel approach to improving the performance of sampling-based Model Predictive Control (MPC) in constrained optimization by leveraging products of experts. Our method splits the main problem into two components: one focused on optimality and the other on feasibility. The solution of each component is represented as a distribution, and we combine the two with a product of experts to implement a project-then-sample strategy. In this strategy, the optimality distribution is projected onto the feasible region before sampling, which makes sampling more efficient. This contrasts with the traditional sample-then-project and naive sample-then-reject methods, and it leads to more diverse exploration and less accumulation of samples on the boundaries. We demonstrate an effective implementation of this principle using a tensor train-based distribution model, which is non-parametric, easy to combine with other distributions at the task level, and straightforward to sample from. We adapt existing tensor train models to suit this purpose and validate our approach through experiments on several tasks, including obstacle avoidance, non-prehensile manipulation, and tasks that require staying within a restricted volume. Our experimental results show that the proposed method consistently outperforms known baselines, providing strong empirical support for its effectiveness. Sample code for this project is available at https://github.com/idiap/smpc_poe.
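The project-then-sample idea can be illustrated with a minimal sketch on a dense 2D grid. This is not the tensor train model used in the paper: the grid, the quadratic cost, the disc obstacle, the temperature, and the helper names `product_of_experts` and `sample_grid` are all assumptions made for illustration. The sketch multiplies an optimality distribution by a feasibility distribution, renormalizes, and samples from the product, then compares this with a naive sample-then-reject baseline.

```python
import numpy as np

def product_of_experts(p_opt, p_feas):
    """Elementwise product of two discretized densities, renormalized."""
    prod = p_opt * p_feas
    total = prod.sum()
    if total <= 0.0:
        raise ValueError("experts have no overlapping support")
    return prod / total

def sample_grid(p, n, rng):
    """Draw n cell indices (row, col) from a 2D probability table."""
    flat = rng.choice(p.size, size=n, p=p.ravel())
    return np.unravel_index(flat, p.shape)

rng = np.random.default_rng(0)
xs = np.linspace(-1.0, 1.0, 41)
X, Y = np.meshgrid(xs, xs, indexing="ij")

# Optimality expert (assumed): Boltzmann weights of a quadratic cost
# whose minimum at (0.2, 0.0) lies inside the obstacle.
cost = (X - 0.2) ** 2 + Y ** 2
p_opt = np.exp(-cost / 0.1)
p_opt /= p_opt.sum()

# Feasibility expert (assumed): uniform outside a disc of radius 0.3.
feasible = (X ** 2 + Y ** 2) > 0.3 ** 2
p_feas = feasible / feasible.sum()

# Project-then-sample: combine first, then draw samples.
p_poe = product_of_experts(p_opt, p_feas)
i, j = sample_grid(p_poe, 1000, rng)
poe_feasible_fraction = feasible[i, j].mean()

# Sample-then-reject baseline: draw from p_opt, keep feasible samples.
i_r, j_r = sample_grid(p_opt, 1000, rng)
reject_acceptance = feasible[i_r, j_r].mean()

print(f"PoE feasible fraction: {poe_feasible_fraction:.2f}")
print(f"Rejection acceptance rate: {reject_acceptance:.2f}")
```

On a dense grid the accepted rejection samples follow the same distribution as the product, so the only difference this toy shows is efficiency: every sample drawn from the product is feasible, while the baseline discards the draws that fall inside the obstacle. A dense table grows exponentially with dimension, which is presumably where a compact representation such as a tensor train becomes relevant.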