Synthesizing physically plausible human motions in 3D scenes is a challenging problem. Kinematics-based methods cannot avoid inherent artifacts (e.g., penetration and foot skating) because they lack physical constraints. Meanwhile, existing physics-based methods cannot generalize to multi-object scenarios, since a policy trained with reinforcement learning has limited modeling capacity. In this work, we present a framework that enables physically simulated characters to perform long-term interaction tasks in diverse, cluttered, and unseen scenes. The key idea is to decompose human-scene interactions into two fundamental processes, Interacting and Navigating, which motivates us to construct two reusable controllers, i.e., InterCon and NavCon. Specifically, InterCon contains two complementary policies that enable characters to enter and leave the interacting state (e.g., sitting on a chair and getting up). To generate interactions with objects at different locations, we further design NavCon, a trajectory-following policy, to keep characters' locomotion within the free space of 3D scenes. Benefiting from this divide-and-conquer strategy, we can train the policies in simple environments and generalize to complex multi-object scenes. Experimental results demonstrate that our framework can synthesize physically plausible long-term human motions in complex 3D scenes. Code will be publicly released at https://github.com/liangpan99/InterScene.