Diffusion-based visuomotor policies excel at learning complex robotic tasks by effectively combining visual data with high-dimensional, multi-modal action distributions. However, diffusion models often suffer from slow inference due to costly denoising processes, or require the complex sequential training introduced by recent distillation approaches. This paper introduces Riemannian Flow Matching Policy (RFMP), a model that inherits the easy training and fast inference capabilities of flow matching (FM). Moreover, RFMP inherently incorporates the geometric constraints common in realistic robotic applications, as the robot state resides on a Riemannian manifold. To enhance the robustness of RFMP, we propose Stable RFMP (SRFMP), which leverages LaSalle's invariance principle to equip the FM dynamics with stability to the support of a target Riemannian distribution. A rigorous evaluation on eight simulated and real-world tasks shows that RFMP successfully learns and synthesizes complex sensorimotor policies on Euclidean and Riemannian spaces with efficient training and inference phases, outperforming Diffusion Policies while remaining competitive with Consistency Policies.
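The fast inference the abstract attributes to flow matching comes from integrating a learned velocity field with only a handful of ODE steps, rather than running a long denoising chain. Below is a minimal Euclidean sketch of that sampling loop, assuming a toy analytic vector field (the conditional straight-line velocity commonly used in FM training) in place of a learned network; `vector_field`, `sample_action`, and the target `a1` are illustrative names, not the paper's API.

```python
import numpy as np

def vector_field(x, t, a1):
    # Toy stand-in for a learned FM velocity field: for the linear
    # probability path x_t = (1 - t) * x0 + t * a1, the conditional
    # target velocity is u_t(x) = (a1 - x) / (1 - t).
    return (a1 - x) / (1.0 - t)

def sample_action(a1, n_steps=10, dim=2, rng=None):
    """Draw x0 from a Gaussian prior and integrate dx/dt = v(x, t)
    from t = 0 to t = 1 with forward Euler (few, cheap steps)."""
    rng = np.random.default_rng(rng)
    x = rng.standard_normal(dim)  # x0 ~ N(0, I)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        x = x + dt * vector_field(x, t, a1)
    return x
```

For this particular linear field, Euler integration lands exactly on the target, which illustrates why FM sampling can use far fewer steps than diffusion denoising; the paper's Riemannian setting additionally replaces the Euclidean update with an exponential-map step on the manifold.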