4D head capture aims to generate dynamic topological meshes and corresponding texture maps from videos. It is widely used in movies and games because it can simulate facial muscle movements and recover dynamic texture changes such as pore squeezing. Industry pipelines typically combine multi-view stereo with non-rigid alignment. However, this approach is error-prone and relies heavily on time-consuming manual processing by artists. To simplify this process, we propose Topo4D, a novel framework for automatic geometry and texture generation, which optimizes densely aligned 4D heads and 8K texture maps directly from calibrated multi-view time-series images. Specifically, we first represent the time-series faces as a set of dynamic 3D Gaussians with fixed topology, in which the Gaussian centers are bound to the mesh vertices. We then alternate between geometry and texture optimization frame by frame, which yields high-quality geometry and textures while keeping the topology stable over time. Finally, we extract dynamic facial meshes with regular wiring and high-fidelity textures with pore-level details from the learned Gaussians. Extensive experiments show that our method outperforms current SOTA face reconstruction methods in both mesh and texture quality. Project page: https://xuanchenli.github.io/Topo4D/.
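The control flow described above (Gaussian centers bound to mesh vertices, a fixed face list, and alternating geometry and texture updates applied frame by frame) can be illustrated with a toy sketch. This is not the Topo4D implementation: the function names `fit_sequence` and `_descend`, the per-vertex target positions and colors standing in for multi-view image supervision, the quadratic loss, and the choice to warm-start each frame from the previous one are all assumptions made purely for illustration.

```python
def _descend(params, targets, lr):
    """One in-place gradient step on 0.5 * sum ||p - t||^2 (toy loss, an assumption)."""
    for p, t in zip(params, targets):
        for k in range(3):
            p[k] -= lr * (p[k] - t[k])


def fit_sequence(init_vertices, faces, target_points, target_colors,
                 n_iters=200, lr=0.1):
    """Toy frame-by-frame alternating optimisation over a fixed-topology mesh.

    init_vertices : list of (x, y, z), one per mesh vertex / Gaussian center
    faces         : list of vertex-index triples; never modified (fixed topology)
    target_points : per-frame list of target positions (hypothetical stand-in
                    for the rendering loss against multi-view images)
    target_colors : per-frame list of target RGB values (same caveat)
    """
    # Each Gaussian center is bound to exactly one mesh vertex.
    centers = [list(v) for v in init_vertices]
    colors = [[0.5, 0.5, 0.5] for _ in init_vertices]
    frames = []
    for points_t, colors_t in zip(target_points, target_colors):
        # Assumption: each frame starts from the previous frame's result.
        for it in range(n_iters):
            if it % 2 == 0:
                _descend(centers, points_t, lr)   # geometry step
            else:
                _descend(colors, colors_t, lr)    # texture step
        frames.append({
            "vertices": [tuple(c) for c in centers],
            "faces": faces,                        # identical in every frame
            "colors": [tuple(c) for c in colors],
        })
    return frames
```

Because only `centers` and `colors` are updated while `faces` is shared by every frame, vertex `i` refers to the same surface point throughout the sequence, which is the property a fixed topology is meant to guarantee.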