Synthesizing interaction-involved human motions has been challenging due to the high complexity of 3D environments and the diversity of possible human behaviors within them. We present LAMA, Locomotion-Action-MAnipulation, to synthesize natural and plausible long-term human movements in complex indoor environments. The key motivation of LAMA is to build a unified framework that encompasses a series of everyday motions, including locomotion, scene interaction, and object manipulation. Unlike existing methods that require motion data "paired" with scanned 3D scenes for supervision, we formulate the problem as test-time optimization that uses only human motion capture data for synthesis. LAMA leverages a reinforcement learning framework coupled with a motion matching algorithm for optimization, and further exploits a motion editing framework via manifold learning to cover possible variations in interaction and manipulation. Through extensive experiments, we demonstrate that LAMA outperforms previous approaches in synthesizing realistic motions in various challenging scenarios. Project page: https://jiyewise.github.io/projects/LAMA/