We propose pix2pix3D, a 3D-aware conditional generative model for controllable photorealistic image synthesis. Given a 2D label map, such as a segmentation or edge map, our model learns to synthesize a corresponding image from different viewpoints. To enable explicit 3D user control, we extend conditional generative models with neural radiance fields. Given widely available pairs of monocular images and label maps, our model learns to assign a label to every 3D point in addition to color and density, which enables it to render the image and a pixel-aligned label map simultaneously. Finally, we build an interactive system that allows users to edit the label map from any viewpoint and generate outputs accordingly.
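To illustrate how a label stored at every 3D point can yield an image and a pixel-aligned label map from the same ray, the following is a minimal sketch of standard NeRF-style alpha compositing extended with a label channel. It is not the authors' implementation: the function name `composite_ray`, the array shapes, and the choice to composite per-class logits are assumptions made for illustration.

```python
import numpy as np


def composite_ray(densities, colors, label_logits, deltas):
    """Alpha-composite per-sample colors and label logits along one ray.

    Hypothetical helper for illustration only.
    densities:    (N,)   non-negative volume density at each sample
    colors:       (N, 3) RGB value at each sample
    label_logits: (N, K) per-class label scores at each sample
    deltas:       (N,)   spacing between consecutive samples
    """
    # Opacity contributed by each sample.
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance: fraction of light that reaches sample i unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = alphas * trans
    # The same weights render both outputs, so the label map is
    # pixel-aligned with the image by construction.
    rgb = weights @ colors
    labels = weights @ label_logits
    return rgb, labels, weights


# One ray with three samples: empty space, then an opaque surface, then
# a sample hidden behind that surface.
densities = np.array([0.0, 1e3, 5.0])
deltas = np.full(3, 0.1)
colors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
label_logits = np.array([[5.0, 0.0], [0.0, 5.0], [5.0, 0.0]])

rgb, labels, weights = composite_ray(densities, colors, label_logits, deltas)
pixel_class = int(np.argmax(labels))
```

Because color and label share one set of compositing weights, the pixel takes both its color and its class from the first opaque sample along the ray, which is what keeps the two renderings consistent across viewpoints in this sketch.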