Existing music-driven 3D dance generation methods focus mainly on producing high-quality dance but offer limited control over the generation process. To address this limitation, we propose a unified framework that generates high-quality dance movements while supporting multi-modal control, including genre control, semantic control, and spatial control. First, we decouple the dance generation network from the dance control network, which avoids degrading dance quality when control information is added. Second, we design a dedicated control strategy for each type of control information and integrate these strategies into the unified framework. Experimental results show that the proposed framework outperforms state-of-the-art methods in both motion quality and controllability.
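The decoupling idea can be illustrated with a minimal sketch. The abstract does not specify how the two networks are connected, so everything below is an assumption made for illustration: a frozen base generator, a separate control branch that contributes a gated residual, and a gate initialized to zero so that the control branch initially leaves the base output unchanged. The names `base_generator`, `ControlBranch`, and `generate` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]


def base_generator(music: Vector) -> Vector:
    """Toy stand-in for a frozen music-to-dance generator (hypothetical)."""
    return [0.5 * m for m in music]


@dataclass
class ControlBranch:
    """Hypothetical control network that adds a gated residual to the base motion."""

    gate: float = 0.0  # assumed zero initialization: the branch starts as a no-op

    def residual(self, music: Vector, control: Vector) -> Vector:
        # Toy conditioning: the offset that would move the base motion onto the control target.
        return [c - 0.5 * m for m, c in zip(music, control)]


def generate(music: Vector, control: Vector, branch: ControlBranch) -> Vector:
    base = base_generator(music)  # the base generator itself is never modified
    res = branch.residual(music, control)
    return [b + branch.gate * r for b, r in zip(base, res)]


music = [1.0, 2.0]
target = [0.0, 0.0]
uncontrolled = generate(music, target, ControlBranch(gate=0.0))
controlled = generate(music, target, ControlBranch(gate=1.0))
```

With the gate at zero, `uncontrolled` equals the base generator's output, which is the sense in which a decoupled design can add control without disturbing the original generation quality. With the gate fully open, this toy residual moves the motion onto the control target.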