Recently, 3D-aware GAN methods based on neural radiance fields (NeRF) have developed rapidly. However, current methods model the whole image as a single neural radiance field, which limits the ability to edit individual semantic parts of the synthesized results. Since NeRF renders an image pixel by pixel, it is possible to split NeRF along the spatial dimension. We propose a Compositional Neural Radiance Field (CNeRF) for semantic 3D-aware portrait synthesis and manipulation. CNeRF divides the image into semantic regions, learns an independent neural radiance field for each region, and finally fuses them to render the complete image. We can thus manipulate the synthesized semantic regions independently while keeping the other parts unchanged. Furthermore, CNeRF is designed to decouple shape and texture within each semantic region. Compared to state-of-the-art 3D-aware GAN methods, our approach enables fine-grained semantic region manipulation while maintaining high-quality, 3D-consistent synthesis. Ablation studies show the effectiveness of the structure and loss function used by our method. In addition, real image inversion and cartoon portrait 3D editing experiments demonstrate the application potential of our method.
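The idea of learning one radiance field per semantic region and fusing them before rendering can be illustrated with a minimal sketch for a single ray. The abstract does not state the fusion rule, so the density-weighted fusion below is an assumption, as are the function names `fuse_regions`, `render_ray`, and `render_composite`. The per-region densities and colors are plain arrays standing in for the outputs of the per-region networks.

```python
import numpy as np

def fuse_regions(sigmas, colors, eps=1e-8):
    """Fuse K per-region fields sampled at N points along one ray.

    sigmas: (K, N) densities, colors: (K, N, 3) RGB values.
    Assumed rule (not taken from the paper): densities add up, and
    colors are averaged with density-proportional weights.
    """
    sigma = sigmas.sum(axis=0)                       # (N,) fused density
    share = sigmas / (sigma + eps)                   # (K, N) region share per sample
    color = (share[..., None] * colors).sum(axis=0)  # (N, 3) fused color
    return sigma, color, share

def render_ray(sigma, color, deltas):
    """Standard NeRF volume rendering along one ray."""
    alpha = 1.0 - np.exp(-sigma * deltas)            # opacity of each sample
    # Transmittance: probability the ray reaches sample i unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = trans * alpha
    rgb = (weights[:, None] * color).sum(axis=0)
    return rgb, weights

def render_composite(sigmas, colors, deltas):
    """Return the pixel color and each region's contribution to it."""
    sigma, color, share = fuse_regions(sigmas, colors)
    rgb, weights = render_ray(sigma, color, deltas)
    region_mask = (share * weights).sum(axis=1)      # (K,) soft semantic mask
    return rgb, region_mask

if __name__ == "__main__":
    # Two hypothetical regions: a red one in front, a blue one behind.
    sigmas = np.array([[5.0, 5.0, 0.0, 0.0],
                       [0.0, 0.0, 5.0, 5.0]])
    colors = np.zeros((2, 4, 3))
    colors[0, :, 0] = 1.0
    colors[1, :, 2] = 1.0
    deltas = np.full(4, 0.5)
    print(render_composite(sigmas, colors, deltas))
```

Because each region keeps its own density and color, changing the arrays of one region only affects pixels where that region has a nonzero share, which is the property that makes independent semantic editing possible in this simplified setting.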