Human motion stylization aims to revise the style of an input motion while keeping its content unaltered. Unlike existing works that operate directly in pose space, we leverage the latent space of pretrained autoencoders as a more expressive and robust representation for motion extraction and infusion. Building upon this, we present a novel generative model that produces diverse stylization results from a single motion (latent) code. During training, a motion code is decomposed into two components: a deterministic content code and a probabilistic style code adhering to a prior distribution; a generator then processes random combinations of content and style codes to reconstruct the corresponding motion codes. Our approach is versatile, allowing a probabilistic style space to be learned from either style-labeled or unlabeled motions, which also provides notable flexibility in stylization. At inference time, users can stylize a motion using style cues from a reference motion or a label. Even in the absence of explicit style input, our model supports novel re-stylization by sampling from the unconditional style prior distribution. Experimental results show that our proposed stylization models, despite their lightweight design, outperform state-of-the-art methods in style reenactment, content preservation, and generalization across various applications and settings. Project Page: https://yxmu.foo/GenMoStyle
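The content/style decomposition described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class `ToyStylizer`, its untrained random linear layers, the code dimensions, and the choice of a standard normal as the style prior are all assumptions made purely for illustration. It only shows the data flow: a deterministic content code, a sampled style code (from a reference motion code or from the prior), and a generator that combines the two.

```python
import math
import random


def linear(weights, x):
    """Apply a bias-free dense layer: returns weights @ x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class ToyStylizer:
    """Hypothetical, untrained stand-in for a latent-space stylizer."""

    def __init__(self, code_dim=8, content_dim=4, style_dim=2, seed=0):
        self.rng = random.Random(seed)
        self.style_dim = style_dim
        self.w_content = random_matrix(content_dim, code_dim, self.rng)
        self.w_mu = random_matrix(style_dim, code_dim, self.rng)
        self.w_logvar = random_matrix(style_dim, code_dim, self.rng)
        self.w_gen = random_matrix(code_dim, content_dim + style_dim, self.rng)

    def encode_content(self, code):
        # Deterministic content code: same input always gives same output.
        return linear(self.w_content, code)

    def encode_style(self, code):
        # Probabilistic style code: predict a Gaussian, then sample from it
        # with the reparameterization trick (mu + sigma * eps).
        mu = linear(self.w_mu, code)
        logvar = linear(self.w_logvar, code)
        return [m + math.exp(0.5 * lv) * self.rng.gauss(0.0, 1.0)
                for m, lv in zip(mu, logvar)]

    def sample_prior_style(self):
        # Assumption: the style prior is a standard normal distribution.
        return [self.rng.gauss(0.0, 1.0) for _ in range(self.style_dim)]

    def generate(self, content, style):
        # The generator maps the concatenated codes back to a motion code.
        return linear(self.w_gen, content + style)

    def stylize(self, code, reference=None):
        content = self.encode_content(code)
        if reference is not None:
            style = self.encode_style(reference)  # style from a reference
        else:
            style = self.sample_prior_style()     # no style input given
        return self.generate(content, style)


model = ToyStylizer()
source = [0.1 * i for i in range(8)]           # placeholder motion code
reference = [1.0 - 0.1 * i for i in range(8)]  # placeholder style reference
from_reference = model.stylize(source, reference)
from_prior = model.stylize(source)
```

Because the style code is sampled, repeated calls to `model.stylize(source)` return different motion codes for the same content, which mirrors the diverse stylization of a single motion code described above. Label-conditioned stylization is omitted from this sketch.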