Developing text-based robot trajectory generation models is particularly difficult because of small dataset sizes, the high dimensionality of the trajectory space, and the inherent complexity of the text-conditional motion distribution. Recent manifold learning-based methods have partially addressed the dimensionality and dataset size issues, but struggle with the complex text-conditional distribution. In this paper, we propose a text-based trajectory generation model that attempts to address all three challenges while relying on only a handful of demonstration trajectories. Our key idea is to leverage recent flow-based models, which are capable of capturing complex conditional distributions, not directly in the high-dimensional trajectory space but in the low-dimensional latent coordinate space of the motion manifold, with deliberately designed regularization terms to ensure smoothness of motions and robustness to text variations. We show that our {\it Motion Manifold Flow Primitive (MMFP)} framework can accurately generate qualitatively distinct motions for a wide range of text inputs, significantly outperforming existing methods.