Dance is an important artistic form of expression in human culture, yet automatically generating dance sequences remains a challenging task. Existing approaches often neglect controllability, a critical aspect of dance generation. They also inadequately model the nuanced influence of music style, producing dances that fail to align with the expressive characteristics of the conditioning music. To address these gaps, we propose Style-Guided Motion Diffusion (SGMD), which integrates a Transformer-based architecture with a Style Modulation module. By combining music features with user-provided style prompts, SGMD ensures that the generated dances not only match the musical content but also reflect the desired stylistic characteristics. To enable flexible control over the generated dances, we introduce a spatial-temporal masking mechanism. Because controllable dance generation has not been fully studied, we construct experimental setups and benchmarks for tasks such as trajectory-based dance generation, dance in-betweening, and dance inpainting. Extensive experiments demonstrate that our approach generates realistic, stylistically consistent dances while empowering users to create dances tailored to diverse artistic and practical needs. Code is available on GitHub: https://github.com/mucunzhuzhu/DGSDP
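The abstract names two mechanisms: style modulation of the denoiser by combined music/style conditions, and a spatial-temporal mask that restricts generation to unspecified joints or frames, which is what lets trajectory control, in-betweening, and inpainting share one sampler. The sketch below is not the authors' implementation; it illustrates one common realization of such a design, using FiLM-style feature modulation and the standard diffusion-inpainting trick of re-imposing noised observations at every denoising step. All names, shapes, and the noise schedule are assumptions.

```python
# Minimal sketch (assumed design, not the authors' code) of style-modulated
# diffusion with spatial-temporal masked sampling for controllable motion.
import torch
import torch.nn as nn

class StyleModulatedDenoiser(nn.Module):
    """Toy denoiser: a Transformer encoder whose features are scaled/shifted
    (FiLM-style) by a vector derived from music features + style prompts."""
    def __init__(self, motion_dim=72, cond_dim=64, width=128, layers=2):
        super().__init__()
        self.in_proj = nn.Linear(motion_dim, width)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(width, nhead=4, batch_first=True),
            num_layers=layers)
        self.film = nn.Linear(cond_dim, 2 * width)   # -> (scale, shift)
        self.out_proj = nn.Linear(width, motion_dim)

    def forward(self, x_t, t, cond):
        # x_t: (B, T, motion_dim) noisy motion; t: scalar in [0, 1];
        # cond: (B, cond_dim) fused music + style embedding (assumed).
        h = self.in_proj(x_t) + t  # crude timestep conditioning for the sketch
        scale, shift = self.film(cond).chunk(2, dim=-1)
        h = self.encoder(h * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1))
        return self.out_proj(h)    # predicted clean motion x_0

@torch.no_grad()
def masked_ddim_sample(model, obs, mask, cond, a_bar):
    """Deterministic DDIM-style sampling with a spatial-temporal mask.
    obs, mask: (B, T, D); mask is 1 where motion is user-specified
    (root trajectory, endpoint frames for in-betweening, etc.)."""
    x = torch.randn_like(obs)
    steps = len(a_bar)
    for t in reversed(range(steps)):
        x0_hat = model(x, t / steps, cond)              # predict clean motion
        eps = (x - a_bar[t].sqrt() * x0_hat) / (1 - a_bar[t]).clamp_min(1e-8).sqrt()
        a_prev = a_bar[t - 1] if t > 0 else x.new_tensor(1.0)
        x = a_prev.sqrt() * x0_hat + (1 - a_prev).sqrt() * eps  # DDIM step
        # re-impose observed joints/frames at the matching noise level,
        # so only the masked-out region is actually synthesized
        obs_t = a_prev.sqrt() * obs + (1 - a_prev).sqrt() * torch.randn_like(obs)
        x = mask * obs_t + (1 - mask) * x
    return x

# Usage (hypothetical shapes): in-between 120 frames given the endpoints.
model = StyleModulatedDenoiser()
betas = torch.linspace(1e-4, 0.02, 50)
a_bar = torch.cumprod(1.0 - betas, dim=0)
obs = torch.zeros(1, 120, 72)
mask = torch.zeros_like(obs)
mask[:, :10] = 1.0; mask[:, -10:] = 1.0                 # keep first/last frames
cond = torch.randn(1, 64)                                # music + style embedding
motion = masked_ddim_sample(model, obs, mask, cond, a_bar)
```

Because the mask is applied per joint and per frame, the same sampler covers all three benchmark tasks from the abstract: masking the root joint over all frames gives trajectory-based generation, masking whole boundary frames gives in-betweening, and masking an interior span gives inpainting.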