In recent years, novel view synthesis from a single image has seen significant progress thanks to rapid advancements in 3D scene representation and image inpainting techniques. While current approaches are able to synthesize geometrically consistent novel views, they often do not handle view-dependent effects properly. Specifically, the highlights in their synthesized images usually appear to be glued to the surfaces, making the novel views unrealistic. To address this major problem, we make the key observation that synthesizing a novel view requires both changing the shading of each pixel according to the novel camera and moving it to the appropriate location. Therefore, we propose to split the view synthesis process into two independent tasks: pixel reshading and relocation. During the reshading process, we take the single image as input and adjust its shading based on the novel camera. This reshaded image is then used as the input to an existing view synthesis method to relocate the pixels and produce the final novel view image. We propose to use a neural network to perform reshading, and we generate a large set of synthetic input-reshaded pairs to train our network. We demonstrate that our approach produces plausible novel view images with realistic moving highlights on a variety of real-world scenes.