We present a novel data-driven framework for unsupervised human motion retargeting that animates a target subject's body shape with the motion of a source subject, allowing motions to be transferred between different characters. Our method is correspondence-free,~\ie it requires neither spatial correspondences between the source and target shapes nor temporal correspondences between different frames of the source motion. The proposed method directly animates a target shape with arbitrary sequences of humans in motion, possibly captured using 4D acquisition platforms or consumer devices. Our framework takes into account a long-term temporal context of $1$ second during retargeting while preserving surface details. To achieve this, we take inspiration from two lines of existing work: skeletal motion retargeting, which leverages long-term temporal context at the cost of surface detail, and surface-based retargeting, which preserves surface details without considering long-term temporal context. We unify the advantages of these works by combining a learnt skinning field with a skeletal retargeting approach. During inference, our method runs online,~\ie the input can be processed serially, and retargeting is performed in a single forward pass per frame. Experiments show that including long-term temporal context during training improves the method's accuracy in terms of both the retargeted skeletal motion and detail preservation. Furthermore, our method generalizes well to unseen motions and body shapes. We demonstrate that the proposed framework achieves state-of-the-art results on two test datasets.