Animating human-scene interactions such as pick-and-place tasks in cluttered, complex layouts is challenging: objects vary widely in geometry and articulation, and scenes contain a variety of obstacles. The main difficulty lies in the sparsity of motion data relative to the wide variation in objects and environments, together with the poor availability of transition motions between different tasks, which makes generalization to arbitrary conditions harder. To address this issue, we develop a system that treats the interaction synthesis problem as a hierarchical goal-driven task. First, we develop a bimanual scheduler that plans a set of keyframes for controlling the two hands simultaneously, so that the pick-and-place task is achieved efficiently from an abstract goal signal such as a target object selected by the user. Next, we develop a neural implicit planner that generates guidance hand trajectories under diverse object shapes and types and diverse obstacle layouts. Finally, we propose a linear dynamic model for our DeepPhase controller that incorporates a Kalman filter to enable smooth transitions in the frequency domain, resulting in more realistic and effective multi-objective control of the character. Our system can produce a wide range of natural pick-and-place movements that adapt to the geometry of objects, the articulation of containers, and the layout of objects in the scene.
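To make the idea of a linear dynamic model with a Kalman filter more concrete, the following is a minimal sketch and not the model used in our system. It assumes a single scalar feature (hypothetically, the amplitude of one phase channel) governed by a constant-velocity model, and shows how the filter turns an abrupt change in the target value into a gradual transition. The function name `kalman_smooth` and the parameters `dt`, `process_var`, and `meas_var` are illustrative assumptions.

```python
def kalman_smooth(measurements, dt=1.0, process_var=1e-3, meas_var=1e-1):
    """Constant-velocity Kalman filter over a scalar signal (illustrative only).

    State x = [value, rate]; only the value is observed (H = [1, 0]).
    All parameter values are assumptions chosen for the example.
    """
    x = [measurements[0], 0.0]          # initial state estimate
    P = [[1.0, 0.0], [0.0, 1.0]]        # initial state covariance
    out = []
    for z in measurements:
        # Predict: x <- F x and P <- F P F^T + Q, with F = [[1, dt], [0, 1]]
        # and Q = diag(process_var, process_var).
        x = [x[0] + dt * x[1], x[1]]
        p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + process_var
        p01 = P[0][1] + dt * P[1][1]
        p10 = P[1][0] + dt * P[1][1]
        p11 = P[1][1] + process_var
        # Update: innovation y, innovation variance s, Kalman gain (k0, k1).
        s = p00 + meas_var
        k0, k1 = p00 / s, p10 / s
        y = z - x[0]
        x = [x[0] + k0 * y, x[1] + k1 * y]
        # Covariance update: P <- (I - K H) P.
        P = [[(1 - k0) * p00, (1 - k0) * p01],
             [p10 - k1 * p00, p11 - k1 * p01]]
        out.append(x[0])
    return out


# A target that jumps from 0 to 1 at frame 10, as at a task transition.
step_signal = [0.0] * 10 + [1.0] * 30
smoothed = kalman_smooth(step_signal)
```

In this sketch, a larger `meas_var` relative to `process_var` makes the filter trust its own dynamics more, which yields a slower and smoother transition; a smaller ratio makes the output follow the target more tightly.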