HACK: Learning a Parametric Head and Neck Model for High-fidelity Animation

Significant advances have been made in parametric models for digital humans, with most approaches concentrating on individual parts such as the body, hand, or face. Connectors such as the neck, however, have been overlooked in these models, and their rich anatomical priors often go unused. In this paper, we introduce HACK (Head-And-neCK), a novel parametric model for constructing the head and cervical region of digital humans. Our model seeks to disentangle the full spectrum of neck and larynx motions, facial expressions, and appearance variations, providing personalized and anatomically consistent controls, particularly for the neck region. To build HACK, we acquire a comprehensive multi-modal dataset of the head and neck under various facial expressions. We employ a 3D ultrasound imaging scheme to extract the inner biomechanical structure, namely the precise 3D rotations of the seven vertebrae of the cervical spine. We then adopt a multi-view photometric approach to capture the geometry and physically-based textures of diverse subjects, who perform a wide range of static expressions as well as sequential head-and-neck movements. Using this multi-modal dataset, we train the parametric HACK model by decomposing the 3D head and neck representation into shape, pose, expression, and larynx blendshapes defined relative to the neutral expression and the rest skeletal pose. We adopt an anatomically consistent skeletal design for the cervical region, and we link expressions to facial action units for artist-friendly controls. HACK treats the head and neck as a unified entity, offering more accurate and expressive controls and a new level of realism, particularly for the neck region. This approach benefits numerous applications and enables inter-correlation analysis between head and neck for fine-grained motion synthesis and transfer.
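
As a rough illustration of the blendshape decomposition described above, the following minimal sketch adds shape, expression, and larynx offsets to a neutral template mesh in the rest pose. It assumes a purely linear, additive formulation in the style of common parametric models; the function names, argument names, and toy data are hypothetical and not taken from HACK, and pose-dependent correctives and skeletal skinning along the cervical vertebrae are deliberately omitted.

```python
from typing import List, Sequence

Mesh = List[List[float]]  # one [x, y, z] entry per vertex


def apply_blendshapes(template: Sequence[Sequence[float]],
                      bases: Sequence[Sequence[Sequence[float]]],
                      coeffs: Sequence[float]) -> Mesh:
    """Return template + sum_k coeffs[k] * bases[k], computed per vertex."""
    if len(bases) != len(coeffs):
        raise ValueError("need exactly one coefficient per blendshape basis")
    out = [list(map(float, v)) for v in template]
    for basis, weight in zip(bases, coeffs):
        for i, offset in enumerate(basis):
            for axis in range(3):
                out[i][axis] += weight * offset[axis]
    return out


def rest_pose_mesh(template, shape_bases, shape,
                   expr_bases, expr, larynx_bases, larynx) -> Mesh:
    """Hypothetical additive combination of the three offset groups.

    Assumption: each group contributes independent per-vertex offsets
    relative to the neutral expression and rest skeletal pose.
    """
    mesh = apply_blendshapes(template, shape_bases, shape)
    mesh = apply_blendshapes(mesh, expr_bases, expr)
    return apply_blendshapes(mesh, larynx_bases, larynx)


# Toy two-vertex "mesh" with one basis per group (made-up numbers).
toy_template = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
toy_shape = [[[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]]
toy_expr = [[[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]]
toy_larynx = [[[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
toy_result = rest_pose_mesh(toy_template, toy_shape, [2.0],
                            toy_expr, [0.5], toy_larynx, [-1.0])
```

In this toy setup each coefficient vector moves only the vertices its basis touches, which is the sense in which the parameter groups are kept separate; a full model would then pose the resulting mesh with a skeleton.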
