In this paper, we address the challenge of generating realistic 3D human motions for action classes never seen during training. Our approach decomposes complex actions into simpler movements observed during training by leveraging the knowledge of human motion contained in GPT models. These simple movements are then combined into a single, realistic animation using the properties of diffusion models. Our claim is that this decomposition and subsequent recombination of simple movements can synthesize an animation that accurately represents the complex input action. The method operates entirely at inference time and can be integrated with any pre-trained diffusion model, enabling the synthesis of motion classes absent from the training data. We evaluate our method by splitting two benchmark human motion datasets into basic and complex actions, and compare its performance against the state of the art.