This paper presents a novel approach to generating the 3D motion of a human interacting with a target object, focusing on the synthesis of long-range and diverse motions, which existing auto-regressive models and path planning-based methods cannot achieve. We propose a hierarchical generation framework to address this challenge. Specifically, our framework first generates a set of milestones and then synthesizes the motion along them. Long-range motion generation is thereby reduced to synthesizing several short motion sequences guided by milestones. Experiments on the NSM, COUCH, and SAMP datasets show that our approach outperforms previous methods by a large margin in both quality and diversity. The source code is available on our project page: https://zju3dv.github.io/hghoi.
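The two-stage idea described in the abstract (milestones first, then short motions between consecutive milestones) can be illustrated with a toy sketch. This is not the authors' implementation: the learned generators are replaced here by random perturbation and linear interpolation, motion is reduced to 2D root positions, and all names (`sample_milestones`, `infill_segment`, `generate_motion`) and parameters are hypothetical.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]  # assumption: a pose is reduced to a 2D root position


def sample_milestones(start: Point, goal: Point, num_milestones: int,
                      noise: float, rng: random.Random) -> List[Point]:
    """Stage 1 stand-in: place noisy milestones between start and goal.

    A real system would sample these from a learned generative model;
    random offsets are used here only to mimic diversity.
    """
    points = [start]
    for i in range(1, num_milestones + 1):
        t = i / (num_milestones + 1)
        x = start[0] + t * (goal[0] - start[0]) + rng.uniform(-noise, noise)
        y = start[1] + t * (goal[1] - start[1]) + rng.uniform(-noise, noise)
        points.append((x, y))
    points.append(goal)
    return points


def infill_segment(a: Point, b: Point, frames: int) -> List[Point]:
    """Stage 2 stand-in: a short motion from a towards b (b itself excluded).

    Linear interpolation replaces the learned short-sequence synthesizer.
    """
    return [(a[0] + (b[0] - a[0]) * k / frames,
             a[1] + (b[1] - a[1]) * k / frames) for k in range(frames)]


def generate_motion(start: Point, goal: Point, num_milestones: int = 3,
                    frames_per_segment: int = 10, noise: float = 0.5,
                    seed: int = 0) -> Tuple[List[Point], List[Point]]:
    """Generate milestones, then fill in each pair of consecutive milestones."""
    rng = random.Random(seed)
    milestones = sample_milestones(start, goal, num_milestones, noise, rng)
    motion: List[Point] = []
    for a, b in zip(milestones, milestones[1:]):
        motion.extend(infill_segment(a, b, frames_per_segment))
    motion.append(goal)  # close the sequence at the final milestone
    return milestones, motion


if __name__ == "__main__":
    milestones, motion = generate_motion((0.0, 0.0), (4.0, 0.0))
    print(len(milestones), "milestones,", len(motion), "frames")
```

Each segment depends only on its two endpoint milestones, which is why a long sequence decomposes into independent short ones. Changing `seed` changes the milestones and therefore the whole trajectory, a crude analogue of the diversity the paper targets.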