We present FaceScape, a large-scale, detailed 3D face dataset, together with a benchmark for evaluating single-view 3D face reconstruction. Trained on FaceScape data, a novel algorithm predicts elaborate, riggable 3D face models from a single input image. The FaceScape dataset contains $16,940$ textured 3D faces captured from $847$ subjects, each performing $20$ specific expressions. The 3D models capture pore-level facial geometry and are processed to share a uniform topology. Each model can be represented as a 3D morphable model for the coarse shape plus displacement maps for the detailed geometry. Taking advantage of the scale and accuracy of the dataset, we further propose an algorithm that uses a deep neural network to learn expression-specific dynamic details. The learned relationship serves as the foundation of our system for predicting 3D faces from a single image. Unlike most previous methods, our predicted 3D models are riggable and retain highly detailed geometry under different expressions. We also use FaceScape data to build in-the-wild and in-the-lab benchmarks for evaluating recent single-view face reconstruction methods. Accuracy is reported and analyzed along the dimensions of camera pose and focal length, which gives a faithful and comprehensive evaluation and reveals new challenges. The dataset, benchmark, and code have been released at https://github.com/zhuhao-nju/facescape.
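The two-level representation described in the abstract (a morphable model for the coarse shape, displacement maps for fine detail) can be illustrated with a minimal sketch. This is not the FaceScape code: it assumes a plain linear morphable model, and it assumes the displacement map has already been sampled to one scalar offset per vertex. The function names `coarse_shape` and `apply_displacement` are hypothetical and chosen only for illustration.

```python
import numpy as np


def coarse_shape(mean, basis, coeffs):
    """Linear morphable model (an assumption for illustration).

    mean:   (V, 3) mean vertex positions
    basis:  (K, V, 3) basis shapes
    coeffs: (K,) weights of the basis shapes
    Returns the (V, 3) coarse mesh: mean + sum_k coeffs[k] * basis[k].
    """
    return mean + np.tensordot(coeffs, basis, axes=1)


def apply_displacement(vertices, normals, displacement):
    """Add fine detail by moving each vertex along its unit normal.

    vertices:     (V, 3) coarse vertex positions
    normals:      (V, 3) per-vertex normals (need not be unit length)
    displacement: (V,) scalar offsets, assumed to be pre-sampled
                  from a displacement map at each vertex
    """
    unit = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    return vertices + displacement[:, None] * unit


# Toy example with 2 vertices and 1 basis shape.
mean = np.zeros((2, 3))
basis = np.ones((1, 2, 3))
coeffs = np.array([2.0])
coarse = coarse_shape(mean, basis, coeffs)

normals = np.array([[0.0, 0.0, 2.0], [0.0, 3.0, 0.0]])
displacement = np.array([0.5, -1.0])
detailed = apply_displacement(coarse, normals, displacement)
```

Separating the two levels keeps the coarse shape low-dimensional (a handful of coefficients) while the detail layer can vary with expression, which is the role the abstract assigns to the learned dynamic details.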