This paper presents a Geometric-Photometric Joint Alignment~(GPJA) method that aligns discrete human expressions at pixel-level accuracy by combining geometric and photometric information. Common practices for registering human heads align landmarks to a facial template mesh with geometry-processing approaches but overlook dense pixel-level photometric consistency. This oversight leads to inconsistent texture parametrization across expressions, hindering the creation of the topologically consistent head meshes widely used in movies and games. GPJA overcomes this limitation by leveraging differentiable rendering to align vertices with target expressions, achieving joint alignment in geometry and photometric appearance automatically, without requiring semantic annotations or pre-aligned meshes for training. It features a holistic rendering alignment mechanism and a multiscale regularized optimization for robust convergence under large deformations. The method uses derivatives at vertex positions for supervision and employs a gradient-based algorithm that guarantees smoothness and avoids topological artifacts during geometry evolution. Experimental results demonstrate faithful alignment under various expressions, surpassing conventional non-rigid ICP-based methods and a state-of-the-art deep-learning-based method. In practice, our method generates meshes of the same subject across diverse expressions, all sharing the same texture parametrization. This consistency benefits face animation, re-parametrization, and other batch operations in face modeling and its applications, with improved efficiency.
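The gradient-based vertex optimization described above can be illustrated with a toy sketch: gradient descent directly on vertex positions, balancing a data term against a Laplacian smoothness regularizer. Note this is a minimal illustrative assumption, not the paper's implementation: the quadratic data term here is a stand-in for the photometric loss that GPJA obtains from differentiable rendering, the setup is 2D, and all function and parameter names (`align_vertices`, `lam`, `lr`) are hypothetical.

```python
import numpy as np

def laplacian_smoothness_grad(V, neighbors):
    # Gradient of the smoothness energy sum_i ||v_i - mean(neighbors(i))||^2,
    # which penalizes vertices that drift away from their local neighborhood.
    grad = np.zeros_like(V)
    for i, nbrs in neighbors.items():
        diff = V[i] - V[list(nbrs)].mean(axis=0)
        grad[i] += 2.0 * diff
    return grad

def align_vertices(V0, target, neighbors, lam=0.1, lr=0.2, iters=500):
    """Gradient descent on vertex positions.

    The data term pulls vertices toward target positions (a stand-in for
    a photometric rendering loss); the Laplacian term keeps the resulting
    deformation smooth, discouraging topological artifacts.
    """
    V = V0.copy()
    for _ in range(iters):
        g_data = 2.0 * (V - target)                      # d/dV ||V - target||^2
        g_reg = laplacian_smoothness_grad(V, neighbors)  # smoothness gradient
        V -= lr * (g_data + lam * g_reg)
    return V
```

Because the regularizer also acts at the optimum, boundary vertices settle slightly short of their targets; the weight `lam` trades data fidelity against smoothness, mirroring the role of regularization in the multiscale optimization.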