Photorealistic avatars of human faces have come a long way in recent years, yet research in this area is limited by a lack of publicly available, high-quality datasets covering both dense multi-view camera captures and rich facial expressions of the captured subjects. In this work, we present Multiface, a new multi-view, high-resolution human face dataset collected from 13 identities at Reality Labs Research for neural face rendering. We introduce Mugsy, a large-scale multi-camera apparatus that captures high-resolution synchronized videos of a facial performance. The goal of Multiface is to close the gap in access to high-quality data in the academic community and to enable research in VR telepresence. Along with the release of the dataset, we conduct ablation studies on how different model architectures influence the model's capacity to interpolate to novel viewpoints and expressions. With a conditional VAE model serving as our baseline, we found that adding spatial bias, a texture warp field, and residual connections improves performance on novel view synthesis. Our code and data are available at: https://github.com/facebookresearch/multiface