Action-driven human motion prediction aims to forecast future human motion from an observed sequence while respecting a given action label. It requires modeling not only the stochasticity of human motion but also smooth, realistic transitions between different action labels. The task is complicated by the fact that most datasets contain no such transition data. Existing work tackles this issue by learning a smoothness prior that simply promotes smooth transitions, yet this can produce unnatural transitions, especially when the observed and predicted motions differ significantly in orientation. In this paper, we argue that valid human motion transitions should incorporate realistic leg movements to handle orientation changes, and we cast transition generation as an action-conditioned in-betweening (ACB) learning task to encourage natural transitions. Because modeling all possible transitions is practically infeasible, ACB is performed only on a small set of selected action classes with active gait motions, such as Walk or Run. Specifically, we follow a two-stage forecasting strategy: we first employ a motion diffusion model to generate the target motion for a specified future action, and then produce the in-betweening motion that smoothly connects the observation to the prediction. Our method requires no labeled motion transition data during training. To show the robustness of our approach, we apply the in-betweening model trained on one dataset to two unseen large-scale motion datasets to produce natural transitions. Extensive experimental evaluations on three benchmark datasets demonstrate that our method achieves state-of-the-art performance in terms of visual quality, prediction accuracy, and action faithfulness.