Generating 3D dances from music is an emerging research task that benefits many applications in vision and graphics. Previous works treat this task as sequence generation; however, it is challenging to render a music-aligned long-term sequence with high kinematic complexity and coherent movements. In this paper, we reformulate it as a two-stage process, i.e., key pose generation followed by in-between parametric motion curve prediction, where the key poses are easier to synchronize with the music beats and the parametric curves can be efficiently regressed to render fluent, rhythm-aligned movements. We name the proposed method DanceFormer; it comprises two cascaded kinematics-enhanced transformer-guided networks (called DanTrans), one for each stage. Furthermore, we propose a large-scale music-conditioned 3D dance dataset, called PhantomDance, that is accurately labeled by experienced animators rather than obtained through reconstruction or motion capture. In addition to pose sequences, this dataset encodes dances as key poses and parametric motion curves, thus benefiting the training of our DanceFormer. Extensive experiments demonstrate that the proposed method, even when trained on existing datasets, can generate fluent, performative, and music-matched 3D dances that surpass previous works quantitatively and qualitatively. Moreover, the proposed DanceFormer, together with the PhantomDance dataset (https://github.com/libuyu/PhantomDanceDataset), is seamlessly compatible with industrial animation software, thus facilitating adaptation to various downstream applications.
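To make the two-stage formulation concrete, the following is a minimal sketch of how a motion could be rendered once key poses and in-between curve parameters are available. It is an illustration under stated assumptions, not the DanceFormer implementation: the abstract does not specify the curve family, so a cubic Hermite curve is assumed here, with per-key tangents standing in for the regressed curve parameters. The names `hermite` and `sample_motion`, the flat per-joint value layout, and the use of beat times as key times are all hypothetical.

```python
from bisect import bisect_right


def hermite(p0, p1, m0, m1, s):
    """Blend two values with the cubic Hermite basis, for s in [0, 1].

    p0, p1 are the endpoint values; m0, m1 are endpoint tangents already
    scaled to the unit interval.
    """
    s2, s3 = s * s, s * s * s
    h00 = 2 * s3 - 3 * s2 + 1
    h10 = s3 - 2 * s2 + s
    h01 = -2 * s3 + 3 * s2
    h11 = s3 - s2
    return h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1


def sample_motion(key_times, key_poses, tangents, t):
    """Evaluate every joint channel at time t (seconds).

    key_times: sorted key times, assumed here to sit on music beats.
    key_poses: one list of joint values per key time (stage-one output).
    tangents:  one list of slopes (value per second) per key time; an
               assumed stand-in for the stage-two curve parameters.
    """
    # Hold the first or last key pose outside the keyed time range.
    if t <= key_times[0]:
        return list(key_poses[0])
    if t >= key_times[-1]:
        return list(key_poses[-1])

    i = bisect_right(key_times, t) - 1       # index of the segment start
    dt = key_times[i + 1] - key_times[i]     # segment duration
    s = (t - key_times[i]) / dt              # normalized position in segment

    return [
        hermite(p0, p1, m0 * dt, m1 * dt, s)
        for p0, p1, m0, m1 in zip(
            key_poses[i], key_poses[i + 1], tangents[i], tangents[i + 1]
        )
    ]


# Toy example: two joint channels, three key poses placed on beats.
beat_times = [0.0, 0.5, 1.0]
poses = [[0.0, 10.0], [1.0, 20.0], [0.0, 10.0]]
slopes = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]

frames = [sample_motion(beat_times, poses, slopes, k / 30) for k in range(31)]
```

In this sketch the curve passes exactly through each key pose at its key time, so beat alignment is decided entirely by where the key poses are placed, and the dense frames are obtained by evaluating a few curve parameters per segment instead of predicting every frame directly.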