Sign Language Production (SLP) aims to generate semantically consistent sign videos from textual statements, and the conversion from text glosses to sign poses (G2P) is a crucial step in this process. Existing G2P methods typically treat sign poses as discrete three-dimensional coordinates and fit them directly, which overlooks the relative positional relationships among joints. To address this, we offer a new perspective: modeling the limb bones to constrain joint associations and gesture details, thereby improving the accuracy and naturalness of the generated poses. In this work, we propose a pioneering iconicity disentangled diffusion framework, termed Sign-IDD, designed specifically for SLP. Sign-IDD incorporates a novel Iconicity Disentanglement (ID) module that explicitly captures the relative positions among joints. The ID module disentangles the conventional 3D joint representation into a 4D bone representation, comprising a 3D spatial direction vector and a 1D spatial distance (the bone length) between adjacent joints. Additionally, an Attribute Controllable Diffusion (ACD) module is introduced to further constrain joint associations: its attribute separation layer separates the bone direction and length attributes, and its attribute control layer uses these attributes to guide pose generation. The ACD module takes the gloss embeddings as semantic conditions and generates sign poses from noise embeddings. Extensive experiments on the PHOENIX14T and USTC-CSL datasets validate the effectiveness of our method. The code is available at: https://github.com/NaVi-start/Sign-IDD.
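To make the 4D bone representation concrete, the following is a minimal sketch of how 3D joints might be mapped to per-bone direction and length attributes and back. It is an illustration under stated assumptions, not the Sign-IDD implementation: the three-bone skeleton `BONES` and the helper names `joints_to_bones` and `bones_to_joints` are hypothetical, and the bones are assumed to be listed parent-first along a single chain.

```python
import math

# Hypothetical toy skeleton: (parent, child) joint index pairs, parent-first.
# Illustrative only; this is not the skeleton used by Sign-IDD.
BONES = [(0, 1), (1, 2), (2, 3)]


def joints_to_bones(joints, bones=BONES, eps=1e-8):
    """Map 3D joints to 4D bones: (dx, dy, dz, length) for each bone."""
    feats = []
    for parent, child in bones:
        diff = [c - p for p, c in zip(joints[parent], joints[child])]
        length = math.sqrt(sum(d * d for d in diff))
        # Unit direction vector; eps guards against zero-length bones.
        direction = [d / max(length, eps) for d in diff]
        feats.append((*direction, length))
    return feats


def bones_to_joints(root, bone_feats, bones=BONES):
    """Rebuild joint positions by walking the bones outward from the root."""
    joints = {bones[0][0]: list(root)}
    for (parent, child), (dx, dy, dz, length) in zip(bones, bone_feats):
        px, py, pz = joints[parent]
        joints[child] = [px + dx * length, py + dy * length, pz + dz * length]
    return [joints[i] for i in sorted(joints)]


# Toy pose with four joints forming a simple chain.
pose = [[0.0, 0.0, 0.0], [0.0, 3.0, 4.0], [1.0, 3.0, 4.0], [1.0, 3.0, 6.0]]
bone_feats = joints_to_bones(pose)
rebuilt = bones_to_joints(pose[0], bone_feats)
```

Keeping direction and length as separate attributes means each can be constrained independently, which is the kind of separation the abstract attributes to the attribute separation layer; how Sign-IDD actually uses them inside the diffusion model is not specified here.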