Human motion synthesis is a fundamental task in computer animation. Recent methods based on diffusion models or GPT-style architectures perform well but suffer from slow sampling and error accumulation. In this paper, we propose \emph{Motion Flow Matching}, a novel generative model for human motion generation that samples efficiently and is effective in motion editing applications. Our method reduces the number of sampling steps from a thousand in previous diffusion models to just ten, while achieving comparable performance on text-to-motion and action-to-motion generation benchmarks. Notably, our approach establishes a new state-of-the-art Fr\'echet Inception Distance on the KIT-ML dataset. Moreover, we introduce a straightforward motion editing paradigm named \emph{sampling trajectory rewriting}, which leverages ODE-style generative models, and apply it to various editing scenarios including motion prediction, motion in-between prediction, motion interpolation, and upper-body editing. Our code will be released.
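To make the ten-step ODE sampling and the idea of sampling trajectory rewriting concrete, the following is a minimal sketch, not the paper's implementation. It assumes a fixed-step Euler solver, a straight-line path between noise and data, and a rewriting rule that overwrites the known entries of the state at every step with the interpolation between the initial noise and the known motion. The names `euler_sample` and `toy_velocity` and the closed-form velocity field are hypothetical stand-ins for a learned network.

```python
import random


def euler_sample(velocity, x0, steps=10, known=None, mask=None):
    """Integrate dx/dt = velocity(x, t) from t=0 (noise) to t=1 with Euler steps.

    If `known` and `mask` are given, entries with mask[i] == True are
    overwritten before every step with the straight-line interpolation
    between the initial noise and the known value. This is an assumed
    form of trajectory rewriting, used here only for illustration.
    """
    x = list(x0)
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        if known is not None:
            # Put the known entries back on their noise-to-data path.
            x = [(1 - t) * n + t * c if m else xi
                 for xi, n, c, m in zip(x, x0, known, mask)]
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    if known is not None:
        # At t=1 the known entries equal the known motion exactly.
        x = [c if m else xi for xi, c, m in zip(x, known, mask)]
    return x


# Toy stand-in for a learned velocity field: it points straight at a
# single fixed target, which is the exact field for a one-sample dataset.
target = [1.0, 2.0, 3.0, 4.0]


def toy_velocity(x, t):
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in target]

# Unconditional generation in ten steps.
sample = euler_sample(toy_velocity, noise, steps=10)

# Editing: keep the first two entries fixed to known values and
# generate only the remaining ones.
known = [5.0, 5.0, 0.0, 0.0]
mask = [True, True, False, False]
edited = euler_sample(toy_velocity, noise, steps=10, known=known, mask=mask)
```

In this toy setting `sample` lands on `target` up to floating-point error, while `edited` keeps the two masked entries at their known values and generates the rest. In a motion setting the mask would select, for example, the observed prefix frames (motion prediction) or the lower-body joints (upper-body editing).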