Following the remarkable success of diffusion models in image generation, recent work has shown that they can also solve a range of inverse problems in an unsupervised way, by constraining the sampling process on a conditioning input. Motivated by this, we present the first approach that uses a diffusion model as a prior for highly accurate 3D facial BRDF reconstruction from a single image. We start from a high-quality UV dataset of facial reflectance (diffuse albedo, specular albedo, and normals), which we render under varying illumination to simulate natural RGB textures. We then train an unconditional diffusion model on concatenated pairs of rendered textures and reflectance components. At test time, we fit a 3D morphable model to the given image and unwrap the face into a partial UV texture. By sampling from the diffusion model while keeping the observed part of the texture intact, the model inpaints not only the self-occluded areas but also the unknown reflectance components, in a single sequence of denoising steps. In contrast to existing methods, we acquire the observed texture directly from the input image, which results in more faithful and consistent reflectance estimation. Through a series of qualitative and quantitative comparisons, we demonstrate superior performance in both texture completion and reflectance reconstruction.
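The test-time procedure described above (sampling while keeping the observed texture fixed) can be illustrated with a minimal sketch of mask-constrained DDPM sampling. This is not the authors' implementation. The 12-channel layout (3 texture, 3 diffuse albedo, 3 specular albedo, 3 normal channels), the linear noise schedule, the step count, and the names `make_schedule`, `dummy_eps_model`, and `inpaint_sample` are all assumptions made for illustration, and the trained denoising network is replaced by a placeholder that predicts zero noise.

```python
import numpy as np

def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    # Linear beta schedule (an assumption; the abstract does not specify one).
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def dummy_eps_model(x_t, t):
    # Placeholder for a trained noise-prediction network (hypothetical).
    return np.zeros_like(x_t)

def inpaint_sample(observed, known_mask, eps_model, num_steps=50, seed=0):
    """Generic replacement-based inpainting with a DDPM sampler.

    observed:   array (C, H, W); only entries where known_mask == 1 are used.
    known_mask: array (C, H, W) with 1 for observed entries and 0 for
                entries the model must generate.
    """
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = rng.standard_normal(observed.shape)  # start from pure noise

    for t in range(num_steps - 1, -1, -1):
        # Standard DDPM reverse step applied to the full tensor.
        eps = eps_model(x, t)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x_generated = mean + np.sqrt(betas[t]) * noise

        # Observed entries: forward-diffuse the observation to level t-1,
        # or use it unchanged at the final step.
        if t > 0:
            x_known = (np.sqrt(alpha_bars[t - 1]) * observed
                       + np.sqrt(1.0 - alpha_bars[t - 1]) * rng.standard_normal(x.shape))
        else:
            x_known = observed

        # Keep observed entries, let the model fill in everything else.
        x = known_mask * x_known + (1.0 - known_mask) * x_generated
    return x

# Toy usage: 12 channels on an 8x8 UV grid (assumed layout).
data_rng = np.random.default_rng(1)
observed = data_rng.uniform(-1.0, 1.0, size=(12, 8, 8))
known_mask = np.zeros_like(observed)
known_mask[:3, :, :4] = 1.0  # only texture channels, only the visible half
result = inpaint_sample(observed, known_mask, dummy_eps_model)
```

In this sketch the mask covers both cases from the abstract at once: occluded texture pixels and all reflectance channels are marked unknown, so a single denoising sequence fills in both, while the visible texture pixels are returned unchanged.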