Sensitivity to severe occlusion and large view angles limits the usage scenarios of existing monocular 3D dense face alignment methods. The state-of-the-art 3DMM-based method directly regresses the model's coefficients, underutilizing the low-level 2D spatial and semantic information that can offer cues for face shape and orientation. In this work, we demonstrate how jointly modeling 3D facial geometry in image space and model space can solve the occlusion and view angle problems. Instead of predicting the whole face directly, we first regress image space features in the visible facial region by dense prediction. Subsequently, we predict our model's coefficients based on the regressed features of the visible regions, leveraging the prior knowledge of whole-face geometry from morphable models to complete the invisible regions. We further propose a fusion network that combines the advantages of both the image space and model space predictions to achieve high robustness and accuracy in unconstrained scenarios. Thanks to the proposed fusion module, our method is robust not only to occlusion and large pitch and roll view angles, which is the benefit of our image space approach, but also to noise and large yaw angles, which is the benefit of our model space method. Comprehensive evaluations demonstrate the superior performance of our method compared with state-of-the-art methods. On the 3D dense face alignment task, we achieve 3.80% NME on the AFLW2000-3D dataset, which outperforms the state-of-the-art method by 5.5%. Code is available at https://github.com/lhyfst/DSFNet.
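The idea of completing invisible regions from a morphable-model prior can be illustrated with a small sketch. It is not the method in this work, which predicts coefficients with a learned network. Instead, it swaps in a closed-form ridge least-squares fit on a synthetic linear shape model, purely to show why coefficients estimated from visible vertices determine the whole face. All names (`fit_coefficients`, `reconstruct`, `nme`), the random basis, and the visibility mask are hypothetical. The NME normalization by the square root of the ground-truth 2D bounding-box area is an assumed convention, not one stated in the text.

```python
import numpy as np


def fit_coefficients(mean_shape, basis, observed, visible, ridge=1e-6):
    """Fit linear-model coefficients from the visible vertices only.

    mean_shape: (N, 3) mean face, basis: (N, 3, K) linear basis,
    observed: (N, 3) vertex positions (only visible rows are used),
    visible: (N,) boolean mask of visible vertices.
    """
    k = basis.shape[-1]
    a = basis[visible].reshape(-1, k)                          # (3V, K)
    b = (observed[visible] - mean_shape[visible]).reshape(-1)  # (3V,)
    # Ridge-regularised normal equations keep the solve stable
    # when only a few vertices are visible.
    lhs = a.T @ a + ridge * np.eye(k)
    return np.linalg.solve(lhs, a.T @ b)


def reconstruct(mean_shape, basis, coeffs):
    """Rebuild every vertex, visible or not, from the coefficients."""
    return mean_shape + basis @ coeffs                         # (N, 3)


def nme(pred, gt):
    """Mean per-vertex Euclidean error divided by sqrt(width * height)
    of the ground-truth 2D bounding box (assumed normalization)."""
    width, height = gt[:, :2].max(axis=0) - gt[:, :2].min(axis=0)
    return np.linalg.norm(pred - gt, axis=1).mean() / np.sqrt(width * height)


# Synthetic stand-in for a morphable model (hypothetical data).
rng = np.random.default_rng(0)
n_vertices, n_coeffs = 500, 10
mean_shape = rng.normal(size=(n_vertices, 3))
basis = rng.normal(size=(n_vertices, 3, n_coeffs))
true_coeffs = rng.normal(size=n_coeffs)
gt = reconstruct(mean_shape, basis, true_coeffs)

# Pretend half of the face is occluded: those vertices are never observed.
visible = np.zeros(n_vertices, dtype=bool)
visible[: n_vertices // 2] = True
observed = np.where(visible[:, None], gt, 0.0)

coeffs = fit_coefficients(mean_shape, basis, observed, visible)
completed = reconstruct(mean_shape, basis, coeffs)
print("NME over all vertices, including occluded ones:", nme(completed, gt))
```

In this noise-free toy setting the occluded half is recovered almost exactly, because the model has far fewer coefficients than visible coordinates. With real images the visible-region features are noisy, which is one reason a fusion of image space and model space predictions is attractive.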