Avatars are important for creating interactive and immersive experiences in virtual worlds. One challenge in animating these characters to mimic a user's motion is that commercial AR/VR products consist only of a headset and controllers, providing very limited sensor data about the user's pose. Another challenge is that an avatar might have a different skeleton structure than a human, and the mapping between them is unclear. In this work we address both of these challenges. We introduce a method to retarget motions in real time from sparse human sensor data to characters of various morphologies. Our method uses reinforcement learning to train a policy to control characters in a physics simulator. We only require human motion capture data for training, without relying on artist-generated animations for each avatar. This allows us to use large motion capture datasets to train general policies that can track unseen users from real and sparse data in real time. We demonstrate the feasibility of our approach on three characters with different skeleton structures: a dinosaur, a mouse-like creature and a human. We show that the avatar poses often match the user surprisingly well, despite having no sensor information about the lower body available. We discuss and ablate the important components of our framework, specifically the kinematic retargeting step, the imitation, contact and action rewards, and our asymmetric actor-critic observations. We further explore the robustness of our method in a variety of settings, including unbalancing, dancing and sports motions.
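To make the reward structure mentioned above concrete, the following is a minimal sketch of how imitation, contact and action terms could be combined into a single scalar reward. The functional forms (exponentiated tracking error, fraction of matching foot contacts, exponentiated action change), the scale constants, the weights and all function names are assumptions chosen for illustration; the abstract does not specify them.

```python
import math

def imitation_reward(sim_points, ref_points, scale=2.0):
    """Assumed form: exp(-scale * mean squared distance) between
    simulated and reference keypoints (lists of 3D tuples)."""
    sq_dists = [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in zip(sim_points, ref_points)
    ]
    return math.exp(-scale * sum(sq_dists) / len(sq_dists))

def contact_reward(sim_contacts, ref_contacts):
    """Assumed form: fraction of feet whose contact flag matches the reference."""
    matches = sum(1 for s, r in zip(sim_contacts, ref_contacts) if s == r)
    return matches / len(ref_contacts)

def action_reward(action, prev_action, scale=0.1):
    """Assumed form: exp(-scale * squared change in action), favoring smooth control."""
    change = sum((a - b) ** 2 for a, b in zip(action, prev_action))
    return math.exp(-scale * change)

def total_reward(sim_points, ref_points, sim_contacts, ref_contacts,
                 action, prev_action, weights=(0.6, 0.2, 0.2)):
    """Weighted sum of the three terms; the weights are hypothetical."""
    w_imit, w_contact, w_action = weights
    return (
        w_imit * imitation_reward(sim_points, ref_points)
        + w_contact * contact_reward(sim_contacts, ref_contacts)
        + w_action * action_reward(action, prev_action)
    )
```

With weights that sum to one, each term lies in [0, 1], so the total reward is also bounded by one and reaches that bound only when the simulated character tracks the reference keypoints exactly, reproduces its contact pattern, and keeps its action unchanged.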