Developing text-based robot trajectory generation models is particularly difficult because of the small dataset size, the high dimensionality of the trajectory space, and the inherent complexity of the text-conditional motion distribution. Recent manifold learning-based methods have partially addressed the dimensionality and dataset size issues, but they struggle with the complex text-conditional distribution. In this paper, we propose a text-based trajectory generation model that attempts to address all three challenges while relying on only a handful of demonstration trajectories. Our key idea is to leverage recent flow-based models, which are capable of capturing complex conditional distributions, not directly in the high-dimensional trajectory space but in the low-dimensional latent coordinate space of the motion manifold, with deliberately designed regularization terms that ensure smoothness of motions and robustness to text variations. We show that our Motion Manifold Flow Primitive (MMFP) framework can accurately generate qualitatively distinct motions for a wide range of text inputs, significantly outperforming existing methods.