In this paper, we are the first to incorporate view-dependent effects into single-image-based novel view synthesis (NVS). To this end, we propose to exploit the camera motion priors in NVS to model view-dependent appearance or effects (VDEs) as negative disparity in the scene. Recognizing that specularities "follow" the camera motion, we infuse VDEs into the input images by aggregating input pixel colors along the negative-depth region of the epipolar lines. We also propose a 'relaxed volumetric rendering' approximation that allows the densities to be computed in a single pass, improving efficiency for NVS from single images. Our method learns single-image NVS from image sequences alone, making it the first fully self-supervised approach that requires neither depth nor camera pose annotations. We present extensive experimental results showing that the proposed method can learn NVS with VDEs and outperforms state-of-the-art single-view NVS methods on the RealEstate10k and MannequinChallenge datasets.
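For context, the following is a minimal sketch of standard discrete volumetric rendering along a single ray, the baseline formulation that a relaxed approximation would simplify. It is not the relaxed approximation proposed in this paper, whose details the abstract does not give. The function name `composite_along_ray` and its parameters are hypothetical and chosen only for illustration.

```python
import math

def composite_along_ray(densities, colors, deltas):
    """Standard front-to-back volume rendering along one ray.

    densities: non-negative density per sample (illustrative input)
    colors:    (r, g, b) tuple per sample
    deltas:    spacing between consecutive samples
    Returns the composited color and the per-sample weights.
    """
    transmittance = 1.0  # fraction of light not yet absorbed
    out = [0.0, 0.0, 0.0]
    weights = []
    for sigma, rgb, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this sample
        w = transmittance * alpha               # contribution of this sample
        weights.append(w)
        for c in range(3):
            out[c] += w * rgb[c]
        transmittance *= 1.0 - alpha            # attenuate for later samples
    return tuple(out), weights
```

In this standard form, each sample's weight depends on the accumulated transmittance of all samples in front of it, which is the sequential dependency that a single-pass density computation would aim to avoid.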