Creating high-fidelity head avatars from multi-view videos is a core problem for many AR/VR applications. However, existing methods usually struggle to obtain high-quality renderings for all head components simultaneously, since they use a single representation to model components with drastically different characteristics (e.g., skin vs. hair). In this paper, we propose a Hybrid Mesh-Gaussian Head Avatar (MeGA) that models different head components with more suitable representations. Specifically, we select an enhanced FLAME mesh as our facial representation and predict a UV displacement map that provides per-vertex offsets for improved personalized geometric details. To achieve photorealistic renderings, we obtain facial colors using deferred neural rendering and disentangle the neural textures into three meaningful parts. For hair modeling, we first build a static canonical hair model using 3D Gaussian Splatting. A rigid transformation and an MLP-based deformation field are then applied to handle complex dynamic expressions. Combined with our occlusion-aware blending, MeGA generates higher-fidelity renderings for the whole head and naturally supports more downstream tasks. Experiments on the NeRSemble dataset demonstrate the effectiveness of our designs, outperforming previous state-of-the-art methods and supporting various editing functionalities, including hairstyle alteration and texture editing.
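The hair deformation described above — a rigid transformation of the canonical Gaussians followed by an MLP-predicted non-rigid correction — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the MLP architecture, the expression code `expr`, and all weights here are hypothetical stand-ins, and only the Gaussian centers (not scales, rotations, or opacities) are deformed.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_offset(x, expr, weights):
    """Tiny 2-layer MLP (hypothetical stand-in for the paper's deformation
    field): maps each Gaussian center concatenated with an expression code
    to a per-Gaussian 3D offset."""
    W1, b1, W2, b2 = weights
    h = np.concatenate(
        [x, np.broadcast_to(expr, (x.shape[0], expr.shape[0]))], axis=1)
    h = np.tanh(h @ W1 + b1)
    return h @ W2 + b2  # (N, 3) offsets

# Canonical hair Gaussian centers (N, 3) -- random stand-ins for learned ones.
centers = rng.normal(size=(100, 3))

# Rigid transformation (rotation R, translation t) tracking the head pose.
theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([0.0, 0.0, 0.05])

# Hypothetical expression code and randomly initialized MLP weights.
expr = rng.normal(size=(8,))
d_in, d_hid = 3 + 8, 16
weights = (rng.normal(scale=0.1, size=(d_in, d_hid)), np.zeros(d_hid),
           rng.normal(scale=0.1, size=(d_hid, 3)), np.zeros(3))

# Deformed centers: rigid head motion first, then the non-rigid correction.
deformed = centers @ R.T + t + mlp_offset(centers, expr, weights)
print(deformed.shape)  # (100, 3)
```

In a training setup, the MLP weights would be optimized jointly with the Gaussian parameters against the multi-view video frames, rather than randomly initialized as here.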