We present IntrinsicAvatar, a novel approach to recovering the intrinsic properties of clothed human avatars including geometry, albedo, material, and environment lighting from only monocular videos. Recent advancements in human-based neural rendering have enabled high-quality geometry and appearance reconstruction of clothed humans from just monocular videos. However, these methods bake intrinsic properties such as albedo, material, and environment lighting into a single entangled neural representation. On the other hand, only a handful of works tackle the problem of estimating geometry and disentangled appearance properties of clothed humans from monocular videos. They usually achieve limited quality and disentanglement due to approximations of secondary shading effects via learned MLPs. In this work, we propose to model secondary shading effects explicitly via Monte-Carlo ray tracing. We model the rendering process of clothed humans as a volumetric scattering process, and combine ray tracing with body articulation. Our approach can recover high-quality geometry, albedo, material, and lighting properties of clothed humans from a single monocular video, without requiring supervised pre-training using ground truth materials. Furthermore, since we explicitly model the volumetric scattering process and ray tracing, our model naturally generalizes to novel poses, enabling animation of the reconstructed avatar in novel lighting conditions.
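To make "modeling secondary shading effects explicitly via Monte-Carlo ray tracing" concrete, the sketch below estimates the shading integral at a single surface point by sampling incoming directions and testing each for visibility. It is a toy illustration, not the IntrinsicAvatar pipeline: it assumes an opaque Lambertian surface, a single bounce, and a normal aligned with +z, whereas the abstract describes a volumetric scattering formulation combined with body articulation. The names `sample_cosine_hemisphere`, `shade_diffuse`, `env_radiance`, and `visibility` are hypothetical and chosen only for this example.

```python
import math
import random


def sample_cosine_hemisphere(rng):
    """Draw a direction around the +z normal with pdf = cos(theta) / pi."""
    u1, u2 = rng.random(), rng.random()
    r = math.sqrt(u1)
    phi = 2.0 * math.pi * u2
    x, y = r * math.cos(phi), r * math.sin(phi)
    z = math.sqrt(max(0.0, 1.0 - u1))  # z equals cos(theta)
    return (x, y, z)


def shade_diffuse(albedo, env_radiance, visibility, n_samples=1024, seed=0):
    """Monte Carlo estimate of outgoing radiance at a Lambertian point.

    albedo:       diffuse reflectance in [0, 1] (assumed scalar for brevity)
    env_radiance: callable, direction -> incoming environment radiance
    visibility:   callable, direction -> 1.0 if unoccluded, else 0.0
                  (a stand-in for tracing a secondary ray through the scene)
    """
    rng = random.Random(seed)
    brdf = albedo / math.pi  # Lambertian BRDF
    total = 0.0
    for _ in range(n_samples):
        d = sample_cosine_hemisphere(rng)
        cos_theta = d[2]
        pdf = cos_theta / math.pi
        if pdf <= 0.0:
            continue  # grazing sample carries no energy
        total += brdf * env_radiance(d) * visibility(d) * cos_theta / pdf
    return total / n_samples


if __name__ == "__main__":
    # Constant white environment of radiance 2.0 (an assumed test light).
    env = lambda d: 2.0
    # No occluder: analytically equals albedo * radiance.
    print(shade_diffuse(0.8, env, lambda d: 1.0))
    # Hypothetical occluder blocking the half-space x > 0 (a cast shadow).
    print(shade_diffuse(0.8, env, lambda d: 0.0 if d[0] > 0 else 1.0, n_samples=4096))
```

In this sketch the shadowing term comes from the explicit `visibility` query rather than from a learned approximation, which is the distinction the abstract draws against MLP-based methods. Because the estimator multiplies albedo, lighting, and visibility as separate factors, it also shows why explicit ray tracing helps keep those properties disentangled.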