Sampling-based model predictive control (MPC) has potential for use in a wide variety of robotic systems. However, its unstable updates and poor convergence make it unsuitable for real-time control of such systems. This study addresses this challenge with a novel approach derived from the reverse Kullback-Leibler divergence, which has a mode-seeking property and is therefore likely to find one of the locally optimal solutions early. This approach yields a weighted maximum likelihood estimation with both positive and negative weights, which is solved using the mirror descent (MD) algorithm. Negative weights eliminate unnecessary actions, but a practical implementation must be designed, based on rejection sampling, so that the positive and negative updates do not interfere with each other. In addition, Nesterov's acceleration method for the proposed MD is modified to improve it with a heuristic step size that adapts to the noise estimated in the update amounts. Real-time simulations show that the proposed method solves a wider variety of tasks, in a statistical sense, than the conventional method. Moreover, the improved acceleration allows tasks with higher degrees of freedom to be solved even on a CPU alone. The real-world applicability of the proposed method is also demonstrated by optimizing operability in variable impedance control of a force-driven mobile robot. Video: https://youtu.be/D8bFMzct1XM
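To make the idea of a weighted maximum likelihood update with positive and negative weights more concrete, the following is a minimal sketch under strong assumptions, not the method proposed in the study. It assumes a Gaussian sampling distribution with a fixed standard deviation, a toy quadratic cost, and a hypothetical weighting rule (`signed_weights`) that splits samples at the average cost. A plain gradient-style step on the mean stands in for the mirror descent update, and neither the rejection-sampling design nor the modified Nesterov acceleration is modeled. All function names and parameters (`step`, `neg_scale`, `temperature`) are illustrative.

```python
import numpy as np


def signed_weights(costs, temperature=1.0):
    """Hypothetical rule: below-average cost gets positive weight,
    above-average cost gets negative weight (returned as two arrays)."""
    # Standardize so that a higher score means a lower (better) cost.
    score = -(costs - costs.mean()) / (costs.std() + 1e-8)
    pos = np.where(score > 0, np.exp(score / temperature), 0.0)
    neg = np.where(score < 0, np.exp(-score / temperature), 0.0)
    # Normalize each side separately so neither side dominates the update.
    if pos.sum() > 0:
        pos = pos / pos.sum()
    if neg.sum() > 0:
        neg = neg / neg.sum()
    return pos, neg


def update_mean(mean, samples, costs, step=0.5, neg_scale=0.2):
    """One gradient-style step on the Gaussian mean (assumed stand-in for MD)."""
    pos, neg = signed_weights(costs)
    offsets = samples - mean
    pull = pos @ offsets  # direction toward low-cost samples
    push = neg @ offsets  # direction toward high-cost samples
    # Move toward the good samples and away from the bad ones.
    return mean + step * (pull - neg_scale * push)


def run_demo(seed=0, iterations=50, num_samples=64, sigma=0.5):
    """Optimize a toy quadratic cost; returns (initial, final) distance to target."""
    rng = np.random.default_rng(seed)
    target = np.array([1.0, -2.0, 0.5])  # toy optimum, purely illustrative
    mean = np.zeros_like(target)
    initial_dist = np.linalg.norm(mean - target)
    for _ in range(iterations):
        samples = mean + sigma * rng.standard_normal((num_samples, target.size))
        costs = np.sum((samples - target) ** 2, axis=1)
        mean = update_mean(mean, samples, costs)
    return initial_dist, np.linalg.norm(mean - target)


if __name__ == "__main__":
    before, after = run_demo()
    print(f"distance to target: {before:.3f} -> {after:.3f}")
```

In this sketch the negative weights only add a repulsive term scaled by `neg_scale`. The abstract notes that a practical implementation has to keep positive and negative updates from interfering with each other, which this toy version does not attempt.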