The wide adoption of flow-matching methods has greatly advanced robot imitation learning. However, these methods suffer from long inference times. To address this issue, researchers have proposed distillation and consistency methods, but their performance still falls short of the original diffusion and flow-matching models. In this article, we propose a one-step shortcut method with multi-step integration for robot imitation learning. First, to balance inference speed and performance, we extend the shortcut model with a multi-step consistency loss, splitting the one-step loss into multiple step-wise losses and thereby improving one-step inference performance. Second, to address the unstable joint optimization of the multi-step loss and the original flow-matching loss, we propose an adaptive gradient allocation method that stabilizes training. Finally, we evaluate the proposed method on two simulation benchmarks and five real-world tasks. The experimental results verify the effectiveness of the proposed algorithm.