We propose a robust method for learning neural implicit functions that reconstruct 3D human heads with high-fidelity geometry from only a few input views. We represent a 3D human head as the zero level set of a composed signed distance field that consists of a smooth template, a non-rigid deformation, and a high-frequency displacement field. The template captures identity-independent, expression-neutral features and is trained jointly with the deformation network on multiple individuals. The displacement field encodes identity-dependent geometric details and is trained separately for each individual. We train the network in two stages using a coarse-to-fine strategy, without 3D supervision. Our experiments demonstrate that the geometry decomposition and two-stage training make our method robust, and that our model outperforms existing methods in reconstruction accuracy and novel view synthesis under low-view settings. Additionally, the pre-trained template serves as a good initialization when adapting the model to unseen individuals.
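To make the decomposition concrete, the following is a minimal sketch of how a composed signed distance field could be evaluated. It is not the method's implementation: the abstract does not state the exact composition rule, so the form used here (template evaluated at warped points, plus a scalar detail term) is an assumption, and the analytic functions `template_sdf`, `deformation`, and `displacement` are hypothetical stand-ins for the learned networks. The `use_detail` flag loosely mirrors a coarse stage followed by a fine stage.

```python
import numpy as np

def template_sdf(p, radius=1.0):
    """Smooth template: a sphere stands in for the learned shared head shape."""
    return np.linalg.norm(p, axis=-1) - radius

def deformation(x, scale=0.1):
    """Non-rigid deformation: a hypothetical smooth offset that warps
    query points before the template is evaluated."""
    return scale * np.sin(x[..., [1, 2, 0]])

def displacement(x, amplitude=0.01, freq=40.0):
    """High-frequency displacement: a small scalar correction standing in
    for identity-dependent geometric detail."""
    return amplitude * np.prod(np.sin(freq * x), axis=-1)

def composed_sdf(x, use_detail=True):
    """Assumed composition: template at warped points plus a detail term."""
    coarse = template_sdf(x + deformation(x))
    return coarse + displacement(x) if use_detail else coarse

# Query one point inside and one point outside the stand-in surface.
points = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
coarse = composed_sdf(points, use_detail=False)  # template + deformation only
fine = composed_sdf(points, use_detail=True)     # adds the detail term
```

In this sketch the surface is the set of points where `composed_sdf` returns zero; negative values lie inside and positive values outside. Because the detail term is bounded by `amplitude`, it perturbs the coarse surface only slightly, which is the intuition behind adding it after a coarse shape has been fitted.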